Preface


This  research  is  a  continued  effort  into  the  development  of  a  Transcendental  Function 
Processor.  The  processor  has  been  baselined  by  Mickey  Bailey  and  the  approximation 
functions  expanded  and  further  elaborated  to  encompass  a  larger  set  of  functions. 

Intra-processor  data  representation  is  discussed  and  alternate  forms  of  representing 
the  data  considered.  Signed-Digit  representation  is  discussed  in  great  detail  as  a  possible 
alternate  to  standard  binary  representation  inside  the  processor.  Signed-Digit  hardware  is 
presented  along  with  its  estimated  performance  parameters.  The  discussion  of  Signed-Digit 
representation  proves  to  be  the  greatest  thrust  of  this  thesis. 

I would like to thank AFIT, and ENG in particular, for the help and understanding during this thesis effort. Dr. D'Azzo and Major De Groat allowed me to have the time and motivation to complete the thesis. I would also like to thank my wife and family for their support and encouragement throughout the Master's Degree Program.


Robert  Alan  Peterson 
Abstract

In support of the computation requirements of complex equations, a processor which can compute elementary transcendental functions with high throughput is becoming a hard requirement for many systems. In particular, the computation of components of the Vector Wave Equation is becoming bottlenecked by the reduced speed of the processor when computing the required elementary functions.

To speed up the computation of these types of functions, a pipelined processor with high throughput is developed. This processor will compute Sine, Cosine, Tangent, Cotangent, Arctangent, Exponential, Natural Logarithm and Division as a minimum. The accuracy of the computations will be greater than IEEE double precision. The majority of the approximation algorithms are derived from Chebyshev Polynomials, due to their error characteristics and compatibility with a pipelined processor. The only approximation algorithm not derived from Chebyshev Polynomials is the division algorithm. Division is derived from an iterative form of a power series which has a computational form similar to that required by the algorithms developed from Chebyshev Polynomials. To prepare the algorithms for implementation in a pipelined processor, the algorithms are regrouped and rearranged into the form obtained by Horner's method. Then, the development of a unified Transcendental Function Processor is reviewed.

In an attempt to speed up the computations within the processor, alternate forms of data representation are investigated. Signed-Digit representation offers the greatest potential for increased speed over standard binary. This increased speed is due to the reduction of carry-borrow propagation delays throughout the hardware units. Signed-Digit modules are developed and performance estimates given. The modules are then described in VHDL and simulation results presented. From the VHDL module descriptions, a 16 digit by 16 digit multiplier is built and simulated.
SIGNED-DIGIT

HIGH SPEED TRANSCENDENTAL
FUNCTION PROCESSOR ARCHITECTURE


I. Introduction

This  effort  studies  approximation  algorithms  for  various  functions  with  the  premise 
that  the  algorithms  will  be  implemented  in  a  pipeline  processor.  In  an  attempt  to  increase 
processing  speed  of  the  functions,  alternate  forms  of  data  representation  are  investigated. 

Approximation algorithms for the trigonometric, exponential, natural logarithm, and division functions are developed. The structure of the approximation functions must be developed such that the processor's pipeline will not require extensive re-configuration and control between the computation of different functions. Once the algorithms are developed, a unified processor can be designed to encompass pre-processing, pipeline processing, and post-processing.

A pipeline processor can increase the throughput of a system; however, the throughput is limited by the processing speed of the slowest stage. To increase the speed of the stages, either unique processing hardware must be designed or the data must be represented in a form which permits faster computation. This thesis looks at alternate data representation forms which reduce the carry-borrow propagation delays during computations.

Transcendental  Function  Processor  Background 

Approximation algorithms for Sine, Cosine, Tangent, Cotangent, Arctangent, Exponential, and Natural Logarithm have long been known and are quite numerous [1, 2, 3, 4]. The algorithms were derived from Chebyshev Polynomials which are expanded, summed, and regrouped into a polynomial function of x. The pre-processing, pipeline processing, and post-processing requirements are similar for each function. A baseline processor was defined to provide IEEE single precision accuracy for the computations. The performance estimates of the processor are based on the speed of an IEEE single precision floating point multiplier.

Other algorithms which have been investigated include the CORDIC algorithm and other ultra-spherical polynomials [1, 4]. However, the primary algorithm of these investigations, other than those developed from Chebyshev Polynomials, is the CORDIC algorithm. The CORDIC algorithm is an iterative algorithm which cannot be realistically implemented in a pipelined processor. Other problems involve the computation of non-trigonometric functions, to which the CORDIC algorithm is not suited.

Alternate forms of data representation which have been studied include the Negative Base Number System, Residue Number System, and Signed-Digit Number System [9, 10, 12, 13]. Each has associated advantages and disadvantages, which are discussed further in Chapter 4.

Objective 

The objective is to complete the development of the approximation algorithms, which are to provide IEEE double precision accuracy, while investigating alternate forms of data representation to speed up their processing. Once the algorithms are developed, they will be mapped onto a pipelined processor architecture.

Scope 

The scope of this thesis effort is to extend the previous work done on the development of approximation algorithms by extending the precision of the developed algorithms. The algorithm for division will be developed such that its general form is compatible with the processor defined by the algorithms developed from Chebyshev Polynomials. A unified processor will be defined to encompass the processing requirements of all of the approximation functions. Alternate forms of data representation will be studied and their benefits elaborated, with emphasis on the reduction of carry-borrow propagation delays.
Assumptions

This effort assumes that the physical size of the processor is not limited. There are no attempts to determine the resulting chip area that would be required to implement the processor. It is also assumed that the processor will operate in an environment where the pipeline's latency will not cause major problems.

Organization 

The remainder of this thesis is organized as follows. Chapter 2 gives the rationale behind using Chebyshev Polynomials for approximations in the Transcendental Function Processor, as well as the development of the division algorithm. The processor's hardware is discussed in Chapter 3 with a breakdown of its pre-processing, pipeline processing, and post-processing requirements. Chapter 4 presents alternate forms of data representation and elaborates on Signed-Digit representation and its major functional units. Chapter 5 presents the basic Signed-Digit modules used to construct major functional units, components such as multipliers and adder/subtractors, and presents SPICE results as estimates of their performance. Chapter 6 builds the VHDL descriptions of the basic modules and instantiates them to build a Signed-Digit multiplier with an accuracy greater than IEEE double precision. This multiplier is then simulated and performance estimates presented. The thesis is concluded in Chapter 7 with final conclusions and recommendations for follow-on research.
II. Approximation Methods and Algorithms


Approximation  of  Transcendental  Functions 

By definition, transcendental functions are functions which are not algebraic [7]. Therefore, they cannot be expressed in terms of sums, differences, products, quotients, or roots. The only way to evaluate them is by approximation, which leads to the study of approximation methods, or algorithms. Each method has its own advantages and disadvantages. This study looks at the proven methods of approximation with the idea of implementing the algorithms in hardware.

Hardware limitations constrain the total class of approximating methods to approximation algorithms which employ only multiplication and addition. A large number of algorithms use quotient and root functions. In hardware, these functions are too time consuming for implementation as a one step function and are therefore discarded as not viable approximation algorithms for implementation. This dramatically narrows the class of approximation methods. The remaining approximation algorithms may then be compared by looking at the error characteristics of each.

To decide which algorithm is the best, the term best must be clearly defined. In this paper, the best approximation algorithm is the one which requires the fewest mathematical operations and gives an error less than some maximum tolerable error. There are different types of error which are of interest when approximating; each may specify a different algorithm as being the best. If the error associated with the best algorithm is defined as the average difference between the approximating function and the true function across an interval, then the Least Square error is the error type of interest. However, if the maximum deviation between the approximating function and the true function across an interval is of interest, then the type of error specifying the best approximation algorithm is termed the Maximum Norm error. When approximating a function to obtain the domain-range pair on a point-for-point basis, the Maximum Norm error is used to identify the best approximation algorithm. In this study, this is the type of error used to determine the best algorithm. Figure 2.1 shows how the Least Square error and the Maximum Norm error differ, given that the magnitudes of the respective errors are equal. Note that the error function characterizing the Least Square error is near zero over a portion of the interval; however, its maximum deviation is greater than that of the error function characterizing the Maximum Norm error. Since the domain is continuous over the interval of interest, the maximum magnitude of the error, the Maximum Norm error, is used to compare approximation algorithms.


Figure  2.1.  Least  Square  Error  Compared  to  Maximum  Norm  Error. 


Error functions associated with a specific approximation algorithm have characteristic shapes. These shapes not only indicate how well an algorithm, with a given number of terms, approximates the true function; they also indicate how the Maximum Norm error changes as the number of terms used for the approximation changes. Understanding the relationship between the Maximum Norm error and the number of approximation terms can therefore lead to the selection of the best approximation method. The error function associated with the Taylor series, as shown in Figure 2.2, is shaped like a parabola with zero error in the center, the point of differentiation. As the number of terms in the approximation function increases, the error at the end points of the parabola, corresponding to the end points of the interval, decreases. Eventually, if an infinite number of terms is used in the approximating function, the error at the end points becomes zero. Therefore, to get the Maximum Norm error below a specific value, the number of terms required is determined by the magnitude of the error at the end points, while the error between the end points may be acceptable with considerably fewer terms. The error function associated with approximation using Legendre Polynomials oscillates around zero with the magnitude of the oscillations increasing as the end points of the interval are approached. Though the maximum error may not occur at the end points, the maximum error is near the end points. To get this Maximum Norm error below a specific value, the number of terms required is determined by the magnitude of the oscillation near the end points. This is better than the Taylor series since the maximum error does not correspond exactly with the end points of the interval. A better approach is to have the error oscillate with equal magnitude around zero. Then, as the number of terms increases, the Maximum Norm error decreases uniformly across the interval. This equal magnitude oscillation of the error is termed the equal ripple property [3]. The equal ripple property ensures a uniform maximum error across the interval, unlike the Taylor series or Legendre polynomials, which achieve excellent approximations near zero but poor approximations at, or near, the end points. The approximation algorithms which exhibit the equal ripple property are the algorithms which approximate functions using Chebyshev Polynomials.
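The parabola-like growth of the Taylor series error toward the end points can be illustrated numerically. The following is a minimal sketch, assuming the function of interest is sin(πx/2) on the interval from -1 to 1 and a series expanded about x = 0; the function names and the sample points are illustrative choices, not taken from the thesis.

```python
import math

def taylor_sin_half_pi(x, terms):
    """Taylor series of sin(pi*x/2) about x = 0, truncated to `terms` odd powers."""
    z = math.pi * x / 2.0
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * z ** (2 * k + 1) / math.factorial(2 * k + 1)
    return total

def taylor_error(x, terms):
    """Magnitude of the difference between the true function and the truncated series."""
    return abs(math.sin(math.pi * x / 2.0) - taylor_sin_half_pi(x, terms))

# The error is zero at the expansion point and largest at the end of the interval.
for terms in (3, 4, 5):
    samples = [taylor_error(x, terms) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
    print(terms, ["%.2e" % e for e in samples])
```

With three terms the error at x = 1 is several thousandths while the error at x = 0.5 is already a few parts in a hundred thousand, so the end points alone dictate how many terms are needed.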

Approximation  algorithms  using  the  Taylor  series,  Chebyshev  Polynomials,  and  the 
Legendre  Polynomials  are  sub-classes  of  more  general  approximation  algorithms  using 
Ultra-spherical  Polynomials  [3].  The  general  form  of  the  Ultra-spherical  Polynomial  is 

$$P_n^{(\alpha)}(x) = C_n (1 - x^2)^{-\alpha} \frac{d^n}{dx^n} (1 - x^2)^{n+\alpha}$$

where $C_n$ is a constant and $\alpha$ is in the interval $(-1 < \alpha < \infty)$.

Figure 2.2. Error Function Using Taylor's Series Approximations.

A general analysis of approximations using Ultra-spherical polynomials shows that, when $\alpha$ is greater than $-1/2$, the amplitude of the oscillations of the error function increases as $x$ moves away from the origin. Ultimately, as $\alpha$ approaches $\infty$, the series of Ultra-spherical polynomials describes the Taylor series. When $\alpha = 0$, the Ultra-spherical polynomial corresponds to the Legendre Polynomial. However, when $\alpha$ is less than $-1/2$, the magnitude of the oscillations of the error function decreases as $x$ moves away from the origin. The value of $\alpha$ which gives the equal ripple property is $\alpha = -1/2$; this describes the Chebyshev Polynomial.


Chebyshev  Approximation  Methods 

Chebyshev Polynomials are orthogonal polynomials, similar to the trigonometric functions of Sine and Cosine, and are derived from the more general class of Ultra-spherical polynomials. The Chebyshev polynomials, $T_n$, are related to trigonometric functions by the identity

$$T_n(\cos x) = \cos nx.$$

From this identity, and the functional relations of the Cosine,

$$\begin{aligned}
\cos 0 &= 1, \\
\cos x &= \cos x, \\
\cos 2x &= 2\cos^2 x - 1,
\end{aligned}$$

the Chebyshev Polynomials may be derived.

$$\begin{aligned}
T_0(x) &= 1 \\
T_1(x) &= x \\
T_2(x) &= 2x^2 - 1
\end{aligned}$$

Additional Chebyshev polynomials are found by the recursion formula

$$T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x).$$

(The expanded Chebyshev polynomials, up to $n = 22$, are given in [2].)
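The recursion can be carried out mechanically on lists of power-series coefficients. The sketch below is illustrative only: it assumes the recursion T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x) with T_0 = 1 and T_1 = x, stores each polynomial lowest power first, and uses helper names that are not part of the thesis.

```python
import math

def chebyshev_coeffs(n_max):
    """Return coefficient lists (lowest power first) for T_0 through T_n_max."""
    polys = [[1], [0, 1]]                      # T_0 = 1, T_1 = x
    for n in range(1, n_max):
        prev, cur = polys[n - 1], polys[n]
        nxt = [0] + [2 * c for c in cur]       # 2x * T_n(x)
        for i, c in enumerate(prev):           # subtract T_{n-1}(x)
            nxt[i] -= c
        polys.append(nxt)
    return polys[: n_max + 1]

def eval_poly(coeffs, x):
    """Evaluate a coefficient list at x."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

polys = chebyshev_coeffs(5)
print(polys[2], polys[3])

# Spot check of the identity T_n(cos t) = cos(n t) for one angle.
t = 0.7
print(eval_poly(polys[5], math.cos(t)), math.cos(5 * t))
```

The integer coefficients grow quickly with n, which is one reason the expanded tables in the literature stop at a modest order.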

When approximating a function with Chebyshev polynomials, each polynomial is weighted by a constant and then summed.

$$f(x) = \sum_{n=0}^{N} a_n T_n(x), \qquad \text{where } -1 \le x \le 1$$

Since the Chebyshev polynomials exhibit the orthogonality property, odd functions require summing of only the odd polynomials; likewise, even functions require summing of only the even polynomials.

The weighting constants for each polynomial are computed from the function

$$a_n = \frac{2}{\pi} \int_0^{\pi} f(\cos x) \cos nx \, dx$$

This function is not simple to integrate; however, there are means to accomplish the integration, and these are described in Appendix A. The last piece of information required to completely define an approximation algorithm using Chebyshev Polynomials is the number of terms, or polynomials, required for the approximation. To determine this, a relationship is needed between the maximum tolerable error and the number of polynomials required such that the Maximum Norm error from the approximation is less than the maximum tolerable error. This relationship is

$$|\epsilon_N| \approx \frac{f^{(N)}(x)}{2^{N-1} N!} \qquad (2.1)$$

From this relationship, the maximum magnitude of the error can be approximated for any function, given the number of polynomials used to approximate that function.
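The weighting integral and the error estimate can be tried out numerically. The sketch below is an illustration under stated assumptions rather than the method of Appendix A: it evaluates the integral for a_n with a simple midpoint rule, takes sin(πx/2) as the target, halves the n = 0 term when summing (the usual convention for this normalization), and bounds the N-th derivative by (π/2)^N. All names are hypothetical.

```python
import math

def cheb_coefficients(f, count, samples=64):
    """Midpoint-rule estimate of a_n = (2/pi) * integral_0^pi f(cos t) cos(n t) dt."""
    coeffs = []
    for n in range(count):
        total = 0.0
        for j in range(samples):
            t = math.pi * (j + 0.5) / samples
            total += f(math.cos(t)) * math.cos(n * t)
        coeffs.append(2.0 * total / samples)
    return coeffs

def cheb_eval(coeffs, x):
    """Sum a_n * T_n(x) using T_n(x) = cos(n * arccos(x)); the a_0 term is halved."""
    t = math.acos(x)
    total = 0.5 * coeffs[0]
    for n in range(1, len(coeffs)):
        total += coeffs[n] * math.cos(n * t)
    return total

def eq_2_1_estimate(n_terms, derivative_bound):
    """Error estimate of the form |f^(N)| / (2^(N-1) * N!)."""
    return derivative_bound / (2 ** (n_terms - 1) * math.factorial(n_terms))

def target(x):
    return math.sin(math.pi * x / 2.0)

N_TERMS = 6                                    # polynomials T_0 through T_5
coeffs = cheb_coefficients(target, N_TERMS)
grid = [-1.0 + 2.0 * k / 400 for k in range(401)]
measured = max(abs(target(x) - cheb_eval(coeffs, x)) for x in grid)
estimate = eq_2_1_estimate(N_TERMS, (math.pi / 2.0) ** N_TERMS)
print("even-order weights:", coeffs[0], coeffs[2], coeffs[4])
print("measured max error:", measured, "estimate:", estimate)
```

Because the target is odd, the even-order weights come out as rounding noise, which matches the observation that odd functions need only the odd polynomials.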


By  using  Equation  2.1  to  estimate  the  number  of  terms  required  to  have  an  error 
less  than  2“®®,  the  general  form  of  the  Chebyshev  polynomial  approximations  for  the 
transcendental  functions  of  interest  are 

9 


sin(-i)  =  a2n+lT2n+lix) 

^  n=0 

COS(%)  =  j]a2nT2„(x) 

^  n=0 

ff  15 

tan(-x)  =  2c2„+iT2„+i(z) 

^  n=0 

TT 

COt(-z)  =  53  ®2n+1^2n-).l(a;) 

^  n=0 

11 

arctan(z)  =  ^3  ®2n-n7’2„+i(z) 

n=:0 

n=0 

ln(z-»-l)  =  f^a„Tnix) 


n=0 


Approximating with Chebyshev Polynomials has one problem. The form of the approximation algorithms does not fit well into a pipelined architecture. This is due to the computation, weighting, and summing of the terms as the approximation progresses.

$$f(x) = a_0 T_0(x) + a_2 T_2(x) + a_4 T_4(x) + \cdots + a_n T_n(x)$$

However, since all of the terms are polynomials, each term may be expanded and regrouped, using the distributive and associative properties, to form a single polynomial of degree N. This eliminates the computation and weighting of each polynomial term. However, the parallel summing of the powers of the resultant polynomial must still occur.

$$f(x) = B_0 + B_2 x^2 + B_4 x^4 + \cdots + B_n x^n$$

To eliminate this problem, the approximation polynomial may be rearranged by using Horner's method [8]. This results in an expression which is computed as a series of sum-product stages, with the result from each stage used as the input for the next.

$$f(x) = C_0(C_2 + x^2(C_4 + x^2(\cdots(C_n + x^2)\cdots))) \qquad (2.2)$$

This form of approximation is well suited for a pipelined architecture. However, when manipulating the coefficients of the Chebyshev Polynomials to obtain this arrangement, precision is lost. To achieve the same precision as that specified when implementing the approximation using the Chebyshev Polynomials directly, one additional term, or polynomial, is required.
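The chain of sum-product stages can be sketched in a few lines. This is a minimal illustration, assuming a nested form f(x) = C_0(C_2 + x^2(C_4 + x^2(...(C_n + x^2)...))) in which the innermost coefficient has been normalized to one; the helper names and the sample coefficients (a three-term cosine series) are hypothetical and are not the thesis coefficients.

```python
def to_nested_form(b):
    """Split coefficients b[k] of (x^2)^k into a scale C0 and monic inner constants."""
    c0 = b[-1]                                 # coefficient of the highest power
    return c0, [bk / c0 for bk in b[:-1]]

def eval_nested(c0, inner, x):
    """Evaluate the nested form one sum-product stage at a time."""
    u = x * x
    acc = 1.0                                  # innermost term starts as (C_n + u * 1)
    for c in reversed(inner):
        acc = c + u * acc                      # one sum-product stage
    return c0 * acc

def eval_direct(b, x):
    """Reference evaluation: sum of b[k] * x^(2k)."""
    return sum(bk * x ** (2 * k) for k, bk in enumerate(b))

b = [1.0, -0.5, 1.0 / 24.0]                    # illustrative even polynomial
c0, inner = to_nested_form(b)
print(eval_nested(c0, inner, 0.3), eval_direct(b, 0.3))
```

Each pass through the loop needs only the previous stage's output, the constant for that stage, and x squared, which is what lets the stages be laid out one after another in a pipeline.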

Division  Algorithm 

Division is performed by finding the reciprocal of the divisor and multiplying the result by the dividend. Chebyshev polynomials cannot be used efficiently for the approximation of the reciprocal function. Therefore, alternate methods were investigated.

An algorithm is sought which requires only the sum and product operations. Also, the algorithm should be in a form similar to the general form defined by Horner's method, Equation 2.2. The algorithm which best meets these requirements is an iterative form of a power series for the reciprocal [2]. This algorithm has the form

$$Y_{i+1} = Y_i(2 - xY_i) \qquad (2.3)$$

where $Y_i$ is the approximation of $1/x$ and $Y_{i+1}$ is the next approximation. This iterative equation differs from the form that Horner's method yields. However, Equation 2.3 can be rewritten as

$$Y_{i+1} = Y_i(2 + Y_i(0 - x)) \qquad (2.4)$$
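Both forms of the iteration can be compared directly. The sketch below assumes the update Y_{i+1} = Y_i(2 - xY_i) and its rearrangement Y_{i+1} = Y_i(2 + Y_i(0 - x)), written as two sum-product stages fed with the negated argument; the starting guess of 1.0 and the function names are illustrative choices only.

```python
def reciprocal_step(y, x):
    """One refinement of y toward 1/x using y * (2 - x * y)."""
    return y * (2.0 - x * y)

def reciprocal_step_staged(y, neg_x):
    """The same refinement as two sum-product stages, fed with -x."""
    s = 2.0 + y * neg_x                        # first sum-product stage
    return 0.0 + y * s                         # second sum-product stage

x = 0.75
y_a = y_b = 1.0                                # illustrative starting guess
for i in range(6):
    y_a = reciprocal_step(y_a, x)
    y_b = reciprocal_step_staged(y_b, -x)
    print(i + 1, y_a, abs(y_a - 1.0 / x))
```

The printed error roughly squares on every pass, which is why a crude starting guess is enough when several iterations are available.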
This is in the form required by the pipelined architecture presented in the preceding section. However, there are two sum-product functions required for each iteration. Therefore, if k iterations are needed to give a result which has a Maximum Norm error less than some specified error value, then 2k sum-product operations are required. As long as the number of iterations required is less than one-half the order of the highest polynomial used for the approximations by the rearranged Chebyshev Polynomials, no additional stages in the pipeline are required. This algorithm also requires x to be positive. However, Equation 2.4 inverts the sign of x, now requiring it to be negative. Sign corrections can be performed in the pre- and post-processing stages of the architecture.

The number of iterations required to achieve a Maximum Norm error less than some specific value, $\epsilon$, depends on the magnitude of $\epsilon$ and the magnitude of the error in $Y_0$, where $Y_0$ is the initial guess of the reciprocal and must be computed in a pre-processing stage. If the initial guess is defined as

$$Y_0 = \left(\frac{1}{x}\right) + \Delta$$

where $\Delta$ is some error term, then

$$\begin{aligned}
Y_1 &= \left(\frac{1}{x}\right) - x\Delta^2, \text{ and} \\
Y_2 &= \left(\frac{1}{x}\right) - x^3\Delta^4.
\end{aligned}$$

The $i$th iteration yields an error term of

$$\epsilon_i(x) = x^{2^i - 1}\Delta^{2^i}$$

As long as $\epsilon_i(x) < \epsilon$ for all $x$ in an interval, then $Y_i \approx 1/x$. Once the maximum tolerable error, $\epsilon$, and the interval of $x$ are defined, the maximum allowable error for $Y_0$ is determined by the number of iterations, $i$.

$$\Delta_i(x) = \left(\frac{\epsilon}{x^{2^i - 1}}\right)^{1/2^i}$$
As the number of iterations increases, the required accuracy of $Y_0$ decreases.

The difficulty of the reciprocal algorithm is determining how to compute $Y_0$. To make full use of the pipeline hardware required to compute the transcendental functions from the preceding section, eight iterations of the reciprocal algorithm are used. Therefore, the maximum allowable error when $x = 1$ is $\Delta_8(1) \approx 0.85005$ and the maximum allowable error when $x = 1/16$ is $\Delta_8(1/16) \approx 13.45434$. A linear function can compute $Y_0$ for all $x$ in the interval $(1/16 < x < 1)$ and give an error less than $\Delta_8(x)$. The linear function has the form $Y_0(x) = ax + b$. The error function between $Y_0(x)$ and $1/x$ is

$$e(x) = ax + b - (1/x) = \frac{ax^2 + bx - 1}{x}.$$
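The two allowable initial errors quoted above can be reproduced with a short calculation. The sketch assumes that the error after i iterations of Y(2 - xY) is x^(2^i - 1) Δ^(2^i), so that the largest admissible Δ is (ε / x^(2^i - 1))^(1/2^i). The tolerance ε = 2^-60 is an inference, chosen because it reproduces both quoted values; it is not stated explicitly in this passage. The function name is hypothetical.

```python
import math

def max_initial_error(x, iterations, eps):
    """Largest |Delta| with x^(2^i - 1) * Delta^(2^i) <= eps, computed in the log domain."""
    p = 2 ** iterations
    return 2.0 ** ((math.log2(eps) - (p - 1) * math.log2(x)) / p)

EPS = 2.0 ** -60                               # assumed tolerance (inferred, see lead-in)
print(max_initial_error(1.0, 8, EPS))          # compare with the quoted 0.85005
print(max_initial_error(1.0 / 16.0, 8, EPS))   # compare with the quoted 13.45434

# Fewer iterations leave less room for error in the initial guess.
for i in (2, 4, 6, 8):
    print(i, max_initial_error(1.0, i, EPS))
```

Working with base-2 logarithms avoids forming x^255 directly, which would sit close to the bottom of the double precision range for x = 1/16.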

The absolute value of the error generated from $e(x)$ must be less than, or equal to, $\Delta_8(x)$, the initial maximum allowable error, for all $x$ in an interval. The best linear function will not give the line that bisects the function $1/x$ because the error of the linear function at the upper end point of the interval must be less than the error at the lower end point. What is required is to have the ratios of the errors at the end points, relative to their maximum allowable error, equal. By analyzing the error in this manner, the error across the interval is essentially normalized. The normalized error function is


ax*  +  5x  —  1 

"  A^  “  2°  0625a.0.0625- 


(2.5) 


Because of the shape of $1/x$ and the fact that it is being estimated by a linear function, the errors at each end point are negative. There is also some point between them at which the error will be positive and a maximum. This can be seen from Equation 2.5 by realizing that the slope of the line approximating $1/x$ must be negative, giving a negative $a$ in the error function $n(x)$. Then, the numerator of $n(x)$ is a quadratic which opens downward; in the intervals of interest, the denominator is always positive. Also, in order to get the best fit for the approximation line, the line will cross $1/x$. Therefore, the location of maximum positive error of the normalized error function, $n(x)$, is found by setting the first derivative of $n(x)$ equal to 0. This results in


0  =  ♦  1.9375  *  x*  +  6  ♦  0.9375  ♦  x  +  0.0625). 
As long as the leading factor does not equal 0, the location of the maximum positive error, $x_c$, is obtainable from the quadratic term above.

$$x_c = \frac{-B \pm \sqrt{B^2 - 4AC}}{2A}$$

where $A = 1.9375a$, $B = 0.9375b$, and $C = 0.0625$. Since $a$ is negative and the square root term is positive and larger than $B$, the negative of the square root term gives a positive $x_c$. In order to minimize the normalized error over an entire interval, the magnitude of the normalized error at the end points must equal the magnitude of the normalized error at $x_c$ and be of opposite sign. The magnitudes of the normalized error at these points are the maxima for the interval. As long as this maximum is less than 1, the reciprocal algorithm, with eight iterations, will converge to $1/x$ with an accuracy better than $\epsilon$. To find $a$ and $b$, the normalized error function must be used with $x$ equal to the end points of the interval. Then, a normalized error, less than 1, is chosen. This results in two equations with two unknowns whereby $a$ and $b$ are determined. Then, with $a$ and $b$, $x_c$ is computed and the normalized error, $n(x)$, when $x = x_c$ is compared to the chosen normalized error used to determine $a$ and $b$. If the normalized errors are not equal, the chosen normalized error is changed until the normalized error at $x_c$ equals the chosen normalized error, within desired bounds. By using the linear equation with $a$ and $b$ in the pre-processing stages, the initial estimate, $Y_0$, will always cause the final iteration to converge to within the required accuracy, $\epsilon$.
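The convergence claim can be checked in exact arithmetic, since double precision cannot resolve errors this small. The sketch below is an illustration only: the slope and intercept are hypothetical values picked so that the initial error is negative and within the allowable bound at both end points of the interval from 1/16 to 1. They are not the constants derived in the thesis, and the tolerance 2^-60 is an assumed value.

```python
from fractions import Fraction

A = Fraction(-16, 5)                           # hypothetical slope, not the thesis value
B = Fraction(17, 5)                            # hypothetical intercept, not the thesis value

def reciprocal(x, iterations=8):
    """Linear initial guess followed by repeated y * (2 - x * y) refinements."""
    y = A * x + B
    for _ in range(iterations):
        y = y * (2 - x * y)
    return y

def worst_error(points=64):
    """Largest exact error over evenly spaced x in the interval [1/16, 1]."""
    worst = Fraction(0)
    for k in range(points + 1):
        x = Fraction(1, 16) + Fraction(15, 16) * Fraction(k, points)
        worst = max(worst, abs(reciprocal(x) - 1 / x))
    return worst

tolerance = Fraction(1, 2 ** 60)               # assumed tolerance
print(worst_error() < tolerance)
print(float(A * 1 + B - 1), float(A * Fraction(1, 16) + B - 16))  # end point errors
```

With these illustrative constants the end point errors are -0.8 and -12.8, both negative as the text predicts and both inside the allowable initial error for eight iterations.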

Summary  of  Algorithms 

All of the algorithms used, with the exception of the algorithm for the approximation of the division function, are based on Chebyshev Polynomials. This is due to the error characteristics of Chebyshev Polynomials compared with other approximation algorithms. By using an algorithm which has the equal ripple property, fewer terms are required to achieve a specified precision. Then, by regrouping and rearranging the polynomials, a form suitable for pipeline processing emerges. The approximation algorithm for the division function is based on an iterative power series. The form of the power series is compatible with the form obtained from the modified Chebyshev Polynomials.
III. Processor Architecture


Pre-processing  Stages 

The pre-processing stages of the processor convert the arguments of the functions into the form required by the algorithms implemented in the pipeline. The conversion of the arguments takes the form of scaling and sign correction to prepare them for the pipeline. These operations of the pre-processor are fast and add little overhead to the entire processor function.

Sine and Cosine Pre-processing. The Sine and Cosine functions are computed by using only the regrouped, rearranged Chebyshev Polynomials to approximate $\sin(\pi x/2)$. This eliminates the lookup table entries for the coefficients required for the $\cos(\pi x/2)$ function in the pipeline and reduces the overall complexity of the control logic for the processor. The Cosine function is related to the Sine function by the identity

$$\cos x = \sin\left(\frac{\pi}{2} - x\right).$$

The first step in the pre-processing stage is to determine if the Sine or the Cosine function is being called. If the Cosine function is being called, then the argument is transformed to an argument for the Sine function by subtracting it from $\pi/2$. If the Sine function is being called, then the argument passes unaltered to the next stages of the pre-processor. From this point on, the pre-processing stages are the same for both the Sine and Cosine functions.

The required range of the argument passed to the pipeline is $(-1 < x < 1)$. To
bring the processor's input into this range, the input is multiplied by the constant
$2/\pi$, and the result is separated into a sign component, an integer component, and a
fractional component. The sign component gives the direction of rotation for the functions
while the integer component, together with the sign component, gives the quadrant of the
argument. If the integer component is odd, the fractional component is subtracted from 1.
Otherwise, the fractional component is unaltered. The sign of the fractional component is
determined by the sign component XORed with the next least significant bit of the integer
component. Since the sign component must be stripped out of the argument, leaving the
integer and fractional components both positive, the constant multiplied with the argument
may instead be $-2/\pi$. Simple logic in the front end of the multiplier selects
which constant to use. This choice also determines the sign component. Only the two least
significant bits of the integer component are required. Since the integer component is
positive and identifies which quadrant the argument is in, zero to three, two bits are all
that is required and all higher bits are discarded.
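
The reduction steps described above can be sketched in software. The following Python fragment is only an illustrative model of the data flow, not the hardware design; the function name `preprocess_sin_cos` and its `is_cosine` flag are hypothetical, and ordinary floating point stands in for the processor's internal number format.

```python
import math

def preprocess_sin_cos(x, is_cosine=False):
    """Return a signed value a in [-1, 1] such that sin(pi*a/2) equals
    sin(x), or cos(x) when is_cosine is set (hypothetical helper)."""
    if is_cosine:
        x = math.pi / 2 - x              # cos(x) = sin(pi/2 - x)
    sign = x < 0                          # sign component
    # Multiplying |x| by 2/pi models selecting 2/pi or -2/pi so that the
    # integer and fractional components are both positive.
    scaled = abs(x) * (2 / math.pi)
    whole = int(scaled)
    quadrant = whole & 0b11               # keep only the two least significant bits
    frac = scaled - whole                 # fractional component
    if quadrant & 1:                      # odd integer component
        frac = 1.0 - frac
    negative = sign ^ bool(quadrant & 2)  # sign XOR next least significant bit
    return -frac if negative else frac
```

For example, `math.sin(math.pi * preprocess_sin_cos(2.0) / 2)` agrees with `math.sin(2.0)` to within floating-point rounding.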

The  overall  pre-processing  requirements  for  the  Sine  and  Cosine  functions  are  shown 
in  Figure  3.1.  The  pre-processing  stages  are  controlled  by  the  command  word  directing  the 
processor  to  compute  the  Sine  or  the  Cosine  of  an  argument.  This  global  control  is  used 
only to select whether to multiplex $x$ or $\pi/2 - x$ to the next stages. All other controls for
the  pre-processing  stages  are  local  control  signals  and  do  not  need  to  extend  beyond  the 
pre-processor. 

Figure 3.1. Sine/Cosine Pre-processing Requirements.

Tangent and Cotangent Pre-processing  The Tangent and Cotangent pre-processing
is similar to the pre-processing required for Sine and Cosine. The identity

$$\tan x = \cot\left(\frac{\pi}{2} - x\right)$$

is used to reduce the number of coefficients in the look-up tables and the amount of control
in the pipeline by computing only the Cotangent function in hardware and converting
Tangent arguments to Cotangent arguments. This conversion hardware is the same as that
required for the Cosine to Sine argument conversion. Therefore, if the Tangent function
is to be computed, the argument is subtracted from $\pi/2$ and the resulting argument is
operated on as if the Cotangent function had been called. The next step is to scale the
argument into the range $(-1 < x < 1)$ and extract the sign, integer, and fractional
components for the computation of the rotation and quadrant of the argument. This is the
same as the requirement for the Sine/Cosine argument. The argument is multiplied by
$2/\pi$, or $-2/\pi$, and the result is separated into its three components. The least
significant bit of the integer component is used to select whether to use the fractional
component directly or to subtract it from 1. If the integer component is odd (the least
significant bit is 1), the fractional component is subtracted from 1 to give the correct
magnitude. Otherwise, the least significant bit is 0 and the fractional component is
unaltered. The sign of the fractional component is determined by the XOR of the sign
component and the next least significant bit of the integer component. Up to this point,
the hardware requirements for pre-processing the Tangent and Cotangent arguments are the
same as those for pre-processing the Sine and Cosine arguments. However, the range of the
argument for the
Tangent  approximation  is  (-njr/4  <  x  <  )r/2—  nir/A)  and  the  range  for  the  Cotangent 
argument  is  (— Jr/4  <  nitxl2  <  Jr/4).  The  final  pre-processing  step  is  to  multiply  the 
resultant argument by 2. If the result is greater than 1, an internal error is generated which
indicates that the argument is out of range for the called function and that the co-function,
in conjunction with the division function, must be used to compute the required result. This
constrains the computation of the Tangent and Cotangent functions somewhat. However,
it is necessary in order to limit the length of the pipeline to a reasonable number of stages.
One method to overcome this problem is to extend the processor's control section such that,
when it detects an out-of-range error, the co-function and division function are internally
scheduled and performed to get the desired result. The additional control logic hardware
must be weighed against the alternative of having software check the arguments before
requesting the function, and against the frequency with which arguments are out of range.

The  pre-processing  requirements  for  the  Tangent  and  Cotangent  functions  are  shown 
in  Figure  3.2.  Like  the  Sine  and  Cosine  functions,  the  global  control  is  used  only  to  select 
which  function  is  to  be  performed.  All  other  control  operations  do  not  extend  beyond  the 
pre-processing  stages.  The  out-of-range  error  is  used  as  discussed  above. 

Figure 3.2. Tangent/Cotangent Pre-processing Requirements.

Arctangent Pre-processing  The pre-processing requirements for the Arctangent function
are described in [1] and reviewed here. The range of the argument required for the
pipeline is $(-1 < x < 1)$. If the argument of the Arctangent function is within this range,
it may be given directly to the pipeline for computation. However, if the argument is outside
this range, the trigonometric identity

$$\arctan(x) = \frac{\pi}{2} - \arctan(1/x)$$

must be used. An error signal must be generated and handled either internally, by the
control section scheduling the proper operations, or by software, to compute the desired
value. Figure 3.3 shows the pre-processing requirements for the Arctangent function. This
differs from Figure 3.4 in [1] because the control section has to schedule the reciprocal
operation as a separate function and not just as a pre-processing operation.

Figure 3.3. Arctangent Pre-processing Requirements.
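
A short software model may help show how the range check and the reciprocal identity fit together. This is a sketch under assumptions: `arctan_with_range_check` is a hypothetical name, `math.atan` stands in for the pipeline, and the `-pi/2` branch for negative arguments is the standard extension of the identity, which as written holds for positive $x$.

```python
import math

def arctan_with_range_check(x, pipeline_atan=math.atan):
    """Model of the Arctangent pre-processing decision (illustrative only)."""
    if -1.0 < x < 1.0:
        return pipeline_atan(x)           # argument already in range
    # Out of range: the control section (or software) schedules the
    # reciprocal as a separate division operation.
    reciprocal = 1.0 / x
    # arctan(x) = pi/2 - arctan(1/x) for x > 0; the sign of pi/2 follows x.
    half_pi = math.copysign(math.pi / 2, x)
    return half_pi - pipeline_atan(reciprocal)
```

Note that when $x = \pm 1$ the reciprocal is also $\pm 1$, so a real design would need to decide how the boundary of the open interval is handled.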

Exponential Pre-processing  The pre-processing for the Exponential function, as
described in [1], requires that $x$ be decomposed into an integer value $I$ and a fractional
value $F$:

$$e^x = e^I \cdot e^F.$$

The integer portion, $e^I$, is evaluated by using a ROM table to look up the result. For IEEE
single precision, the required ROM table is 89 words deep; for double precision, the ROM
table is 712 words deep. The fractional component, $e^F$, is computed by submitting $F$ to
the pipeline for computation. The integer and fractional results are then multiplied in the
post-processor. If $x$ is negative, an internal error is generated and the control section may
either schedule the exponential and division functions for $x$ or generate an external error
and let the software handle the error. Figure 3.4 shows the pre-processing requirements for
the exponential function. The extractor separates the argument into an integer part and
a fractional part. The integer part is used to find the value of $e^I$ in the ROM table while
the fractional part is operated on in the pipeline. If the integer value is larger than the
depth of the ROM table, an overflow error is generated. This error signifies that the value
of $e^x$ is larger than the largest value which can be represented in the data representation
form, such as IEEE single or double precision.

Figure 3.4. Exponential Pre-processing Requirements.
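
The decomposition and ROM look-up can be modelled in a few lines. The sketch below assumes IEEE single precision and uses `math.exp` as a stand-in for both the ROM contents and the pipeline; the names `EXP_ROM`, `exp_preprocess`, and `exp_via_pipeline` are hypothetical. The table depth is derived from the largest finite single-precision value rather than hard-coded.

```python
import math

FLT_MAX = 3.4028234663852886e38   # largest finite IEEE single-precision value

# One ROM word for every integer I whose e**I is still representable.
ROM_DEPTH = math.floor(math.log(FLT_MAX)) + 1
EXP_ROM = [math.exp(i) for i in range(ROM_DEPTH)]

def exp_preprocess(x):
    """Split x into a ROM value e**I and a fraction F (illustrative model)."""
    if x < 0:
        raise ValueError("negative argument: internal error, handled by control or software")
    integer = int(x)
    frac = x - integer
    if integer >= ROM_DEPTH:
        raise OverflowError("e**x exceeds the single-precision range")
    return EXP_ROM[integer], frac

def exp_via_pipeline(x, pipeline_exp=math.exp):
    rom_value, frac = exp_preprocess(x)
    return rom_value * pipeline_exp(frac)   # multiply done in the post-processor
```

Under these assumptions the computed depth matches the 89 words quoted above for single precision.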

Natural Logarithm Pre-processing  The pre-processing requirements for the Natural
Logarithm function are complex and well described by [1]. Presented here is an overview of
the requirements in order to give an understanding of the full pre-processing requirements
of the processor.

To compute $\ln(x)$ by using the Chebyshev approximation, $\ln(x+1)$ must be computed,
where $x + 1$ must be in the interval $(0.7071 < x + 1 < 1)$. To scale $x + 1$ to this range, the
identity

$$\ln x^y = y \ln x$$

is used to separate the exponent of the argument from the mantissa. The exponent is
then used in the post-processor stages. The mantissa is then scaled by a value which is
selected by the magnitude of the mantissa in order to get a result in the required range.
The identity

$$\ln mn = \ln m + \ln n$$

is used to justify the scaling and the later subtraction of the Natural Logarithm of the
scaling factor from the pipeline's result in the post-processing stages. Figure 3.5 shows the
pre-processing requirements for the Natural Logarithm function.
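
As a reading aid, the two identities can be combined into a single expression. The symbols $M$ (mantissa), $E$ (unbiased binary exponent), $s$ (scaling factor), and $u$ (the argument handed to the pipeline) are introduced here only for illustration; the detailed scheme is the one given in [1].

```latex
\begin{align*}
  a      &= M \cdot 2^{E} \\
  \ln a  &= \ln M + E \ln 2          && \text{(exponent separated from mantissa)} \\
  \ln M  &= \ln(sM) - \ln s          && \text{(scaling, undone in the post-processor)} \\
  \ln a  &= E \ln 2 + \ln(1 + u) - \ln s, \qquad u = sM - 1
\end{align*}
```

Here $\ln(1+u)$ is the quantity evaluated by the pipeline, while $\ln s$ comes from a look-up table and the exponent term is applied in the post-processor.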

Division Pre-processing  The pre-processing requirements for the division function
consist of sign correction of the divisor, extraction of the mantissa and exponent of the
divisor, and the computation of the initial guess, $Y_0$. The algorithm implemented in the
pipeline requires the divisor to be positive. Therefore, if the divisor is negative, the
numerator and denominator are both multiplied by $-1$. This performs the required sign
correction for the divisor without any additional requirements imposed on the
post-processor. The exponent and mantissa are separated and the mantissa is shifted, with a
corresponding adjustment of the exponent, such that it is in the range $(1/16 < M < 1)$. The
exponent is then operated on separately from the mantissa. The mantissa is then used as the
argument of a linear function to compute an initial guess of the reciprocal, $Y_0$. The linear
function,

$$Y_0 = aM + b,$$

is used, where $a$ and $b$ are both constants which do not depend on the value of $M$. $Y_0$
and $M$ are then given to the pipeline for computation of the reciprocal of the denominator
while the numerator and the exponent of the denominator are sent to the post-processor
for eventual processing.
Figure 3.5. Natural Logarithm Pre-processing Requirements.

Figure 3.6 shows the pre-processing requirements for division. At a minimum, the
control hardware must detect a zero denominator; the control hardware could also be
extended to detect a zero numerator.
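
The division pre-processing steps can be sketched as follows. This is a software model under stated assumptions: the coefficients `A_COEFF` and `B_COEFF` are hypothetical placeholders (the text does not give $a$ and $b$), the exponent is forced to a multiple of four as an assumption consistent with a radix-16 mantissa range, and `math.frexp` stands in for the hardware extractor.

```python
import math

# Hypothetical coefficients for the linear first guess Y0 = a*M + b.
A_COEFF = -2.0
B_COEFF = 3.0

def divide_preprocess(numerator, denominator):
    """Return (numerator, M, E, Y0) with denominator == M * 2**E (illustrative)."""
    if denominator == 0:
        raise ZeroDivisionError("zero denominator detected by control hardware")
    if denominator < 0:                       # sign correction of the divisor
        numerator, denominator = -numerator, -denominator
    mantissa, exponent = math.frexp(denominator)   # 0.5 <= mantissa < 1
    shift = (-exponent) % 4                   # assumption: exponent kept a multiple of 4
    mantissa /= 2 ** shift                    # right shift keeps M within [1/16, 1)
    exponent += shift
    y0 = A_COEFF * mantissa + B_COEFF         # linear initial guess of 1/M
    return numerator, mantissa, exponent, y0
```

With these placeholder coefficients the product $M \cdot Y_0$ stays between 0 and 2 over the whole mantissa range, which is the condition needed for the reciprocal iteration to converge.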

Unified Pre-processor  A Unified Pre-processor combines all of the requirements of the
preceding sections and establishes one pre-processor to handle them all. This Unified
Pre-processor can take on many different forms, and no single form is necessarily the best
for all environments. The architecture of the pre-processor depends on the frequency with
which each operation is requested. If a certain function is not requested often and it has a
unique pre-processing requirement, then the architecture of the pre-processor will take on a
different configuration than it would if the function were requested more often. In general,
the configuration of the Unified Pre-processor will have to consist of a bus arrangement where
data can be inserted at, and pulled from, different points. By simple analysis, the two extreme
pre-processing requirements are those of the Tangent/Cotangent functions and the Division
function. Much of the hardware required for pre-processing of the Tangent/Cotangent
functions, as well as its layout, is suitable for most of the other functions. The extractor
stage can be constructed such that it is more general in nature, giving the fractional,
integer, sign, exponent, and mantissa components.

The exact layout of a Unified Pre-processor requires a great deal of analysis of
instruction frequencies before it can be properly designed.

Figure 3.6. Division Pre-processing Requirements.

Pipeline Architecture

The pipeline architecture is designed for the computation of algorithms which have
been regrouped and rearranged such that they are expressed in the form generated by
applying Horner's Method [8]. This yields a series of sum-product stages with each stage
feeding the next. The algorithms developed by regrouping and rearranging Chebyshev
polynomials all have a similar form, with one exception. The even functions only use even
powers of the argument presented to the pipeline while odd functions only use odd powers.
Functions such as the Exponential use both even and odd powers. However, when all of
the functions are expressed using Horner's Method, only $x$ and $x^2$ are required. Even
functions only use the $x^2$ term,

$$f_{even}(x) = c_0 + x^2(c_2 + x^2(\cdots(c_n + x^2)\cdots)),$$

while odd functions use both terms,

$$f_{odd}(x) = x(c_1 + x^2(c_3 + x^2(\cdots(c_m + x^2)\cdots))).$$

Functions which are neither even nor odd only use the $x$ term,

$$f_{neither}(x) = c_0 + x(c_1 + x(\cdots(c_k + x)\cdots)).$$

Therefore,  the  first  stage  of  the  pipeline,  as  shown  in  Figure  3.7,  takes  its  argument  and 
squares  it.  Then,  the  argument  and  its  square  are  propagated  down  the  entire  length  of 
the  pipeline,  with  the  pipeline  control  section  selecting  the  argument  to  use,  depending  on 
the  function  being  computed.  The  control  section  also  selects  the  coefficients  to  sum  with 
the  product  result  from  the  previous  stage. 
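
The sum-product structure can be illustrated with a small software model. The function `horner_pipeline` and its `kind` selector are hypothetical names standing in for the control word; unlike the normalized forms above, this sketch accepts an arbitrary innermost coefficient, which is an assumption made to keep the example general.

```python
def horner_pipeline(x, coeffs, kind):
    """Evaluate a polynomial as a chain of sum-product stages.

    coeffs are listed from the innermost (highest-order) term outward.
    kind is "even", "odd", or "neither" and selects the propagated argument.
    """
    x_squared = x * x                     # stage one: square the argument
    arg = x if kind == "neither" else x_squared
    acc = coeffs[0]
    for c in coeffs[1:]:
        acc = c + arg * acc               # one sum-product stage
    if kind == "odd":
        acc = x * acc                     # odd functions also use the x term
    return acc
```

For instance, `horner_pipeline(0.5, [1/24, -1/2, 1], "even")` evaluates $1 - x^2/2 + x^4/24$ at $x = 0.5$ using only the squared argument.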

This  leads  to  the  development  of  a  control  pipeline  where,  as  the  data  advances  down 
the  data  pipeline,  a  control  word  advances  down  a  control  pipeline,  selecting  coefficients 
and  arguments  for  the  data  pipeline  at  each  stage. 

The division algorithm is the only algorithm not derived from Chebyshev polynomials.
Its general form is

$$Y_{i+1} = Y_i(2 - Y_i(0 + x)).$$

The general form shows the requirement of being able to block the propagation of $x^2$
down the data pipeline and replace it with $Y_i$ at select points. This can easily be
accomplished by the control word selecting, through the use of a multiplexer, whether to
propagate $x^2$ or the output of the previous stage down the pipeline.
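
The recurrence is the familiar Newton-Raphson reciprocal iteration, and a brief numerical sketch shows its behaviour. The function name `reciprocal`, the default iteration count, and the starting values used in the usage note are assumptions chosen for illustration; they are not taken from the design.

```python
def reciprocal(m, y0, iterations=10):
    """Refine an initial guess y0 of 1/m with Y_{i+1} = Y_i * (2 - Y_i * m)."""
    y = y0
    for _ in range(iterations):
        product = 0.0 + y * m        # sum-product form: 0 + Y_i * x
        y = y * (2.0 - product)      # error is squared on every pass
    return y
```

Starting from `y0 = 2.875` for `m = 0.0625`, the relative error falls from about 0.82 to below double-precision resolution within ten passes, because each pass squares the error term $1 - mY_i$.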

The  total  number  of  sum-product  stages  in  the  pipeline  is  developed  around  the 
requirement  of  obtaining,  at  a  minimum,  IEEE  double  precision  accuracy.  The  algorithm 
requiring  the  greatest  number  of  sum-product  stages  is  the  algorithm  which  computes 
Tangent/Cotangent.  This  algorithm  requires  16  sum-product  stages  to  achieve  double 
precision  accuracy,  even  with  the  limited  range  of  its  argument. 
Figure 3.7. Stage One of Pipeline.


Figure  3.8  shows  how  the  architecture  of  the  pipeline  is  constructed.  A  total  of  16 
sum-product  stages  follow  the  initial  squaring  stage.  The  control  word,  passing  down  the 
control  pipeline,  selects  coefficients  and  arguments  for  use  in  each  sum-product  stage  as 
well  as  the  argument  to  be  propagated  down  the  argument  pipeline.  The  result  from 
the  last  stage  is  given  to  the  post-processor  for  computation  of  the  final  result. 

Post-processor 

No post-processing is required for the Sine/Cosine, Tangent/Cotangent,
and Arctangent functions. The result from the last stage of the pipeline is the value which
requires scheduling for return to memory or for additional processing. The Exponential
function requires a multiplier in the post-processor to multiply the result of the last pipeline
stage by the value obtained from a ROM table. This result can then be scheduled for return
to memory. The Natural Logarithm function requires a subtractor to subtract the bias
out of the exponent, a subtractor to subtract the Natural Logarithm of the scaling factor,
obtained from a look-up table, from the result of the last stage of the pipeline, and a
multiplier to multiply the two intermediate results. The result from the multiply operation
can be scheduled for return to memory. The post-processing requirements of the Division
function consist of a subtractor, a complementor, and an adder for the exponent to compute
the negative exponent of the denominator. A multiplier is also required to multiply the
reciprocal of the denominator and the sign-adjusted numerator to obtain a final result
which can then be scheduled for return to memory.

The architecture of a unified post-processor depends directly on the level of complexity
of its control section. At one extreme, the control section is relatively simple, having
dummy stages in the post-processor such that all functions require the same number of
clock cycles through the post-processor. At the other extreme, a complex control section
has each post-processor operation selected by the control logic, and the results which
require minimum computation are scheduled for return to memory before results which
require more computation, even though the latter may have arrived at the post-processor first.

Figure 3.8. Pipeline Architecture.


IV.  Intra-Processor  Data  Representation 


Alternate  Data  Representations 

The Transcendental Function Processor requires a look into alternate data
representation schemes. The motivation is to achieve the greatest speed from the
algorithms and hardware designs before looking at the speed-up possible from the different
technologies used to construct the hardware. By looking at alternate data representation
schemes, the hardware design advantages may be analyzed.

The large number of sum-product stages in the processor warrants the analysis of data
representation schemes which can make the computations faster. The primary method of
speeding up the multiplication and addition operations is to reduce the carry-borrow
propagation delay through each hardware component. Carry propagation delay is not a
significant problem for exponents, but it is significant for mantissa values. This difference
is due to the relative sizes, or number of bits, of each. The exponent of an IEEE double
precision number has 11 bits whereas the mantissa has 52 bits. The propagation delay
across 52 bits is significant. Many methods have been proposed to eliminate the problem of
carry-borrow propagation delays. Data representation schemes which have been studied in
great depth include the Residue Number System and the Signed-Digit Number System [9].
The Residue Number System is a digit-oriented system in which no weighting factor is
assigned to any digit. Instead, a value is represented by an n-tuple of residues which
relates to a second n-tuple, m, of relatively prime moduli: each residue is the value taken
modulo the corresponding element of m. The major problems with this system are the
pairing of digit sets and the fact that normalization of a residue number is not practical.
Therefore, precision cannot be maintained for all representable values. A number system
similar to the Residue Number System is the Negative Base System; however, it has the
additional complexity of determining the sign of the number.
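
A small example may make the residue representation concrete. The moduli `(3, 5, 7)` and the function names below are chosen purely for illustration; the sketch shows that addition proceeds independently in each residue position, with no carry between positions, and that the Chinese Remainder Theorem recovers the conventional value.

```python
from math import prod

MODULI = (3, 5, 7)   # pairwise relatively prime; illustrative choice only

def to_rns(value):
    """Represent a non-negative integer by its residues."""
    return tuple(value % m for m in MODULI)

def rns_add(a, b):
    """Add digit by digit; no carry passes between positions."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))

def from_rns(residues):
    """Recover the integer (below the product of the moduli) via the CRT."""
    total = prod(MODULI)
    result = 0
    for r, m in zip(residues, MODULI):
        partial = total // m
        result += r * partial * pow(partial, -1, m)   # modular inverse
    return result % total
```

Here `to_rns(17)` gives `(2, 2, 3)`, and adding the residues of 17 and 30 position by position yields the residues of 47.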
The Signed-Digit Number System is a system which allows for a great amount of
flexibility. A number is represented by a set of digits where each digit can only take on a
value in the set $D_p$. The digit set, $D_p$, is a balanced set where both $\eta$ and $-\eta$ are
elements and $(-p \le \eta \le p)$. A Signed-Digit (SD) number is composed of digits which are
positionally weighted using some radix. This gives a degree of redundancy to the
representation of a number, depending on the value of $p$ in $D_p$.

Regardless of which alternate data representation is used, there is a cost associated
with using it. The costs arise from the requirement to convert numbers represented
in the conventional form to, and from, the alternate representation. As long as these costs
are outweighed by the benefits of the alternate representation, the alternate representation
should be considered.

Signed-Digit  Data  Representation 

As stated previously, a Signed-Digit number is composed of a set of digits where
each digit is positionally weighted and is an element of the digit set $D_p$. SD number
representation has the primary advantage of being free of carry propagation delays. The
SD Number System has four basic properties associated with it [13, 12].

1. The radix r, associated with the positional weighting, is a positive integer.

2. Zero is represented by a unique set of digits.

3. Totally parallel addition and subtraction are possible.

4. There exist transforms between conventional data representation schemes, such as
   IEEE form, and SD representations.

The SD number, $Z$, is expressed as

$$Z = \{Z_0, Z_1, Z_2, Z_3, \ldots, Z_n\}$$

corresponding to

$$Z = Z_0 r^{0} + Z_1 r^{-1} + Z_2 r^{-2} + \cdots + Z_n r^{-n}.$$

Each digit in $Z$ is an element of the digit set $D_p$ where

$$D_p = \{-p, \ldots, -1, 0, 1, \ldots, p\}.$$

In general, the maximum value of $p$ is

$$p_{max} \le r - 1$$

and its minimum value is

$$p_{min} \ge \left\lceil \frac{r-1}{2} \right\rceil.$$


The above are general constraints defined by [12]. More specific constraints on $p$ defined
by [13] are

$$p_{max} \le r - 2$$

and

$$p_{min} \ge \left\lceil \frac{r-1}{2} \right\rceil + 2.$$

The more restrictive constraints on $p$ simplify the normalization procedure of an SD
number. Another feature of SD numbers is that each digit carries its own sign and the sign of
the SD number is given by the sign of its most significant non-zero digit.

Because the digit set $D_p$ is balanced and each digit carries its own sign, numbers
represented as SD may have a degree of redundancy associated with them. A minimally
redundant SD Number System is defined as one where

$$p = \left\lceil \frac{r-1}{2} \right\rceil;$$

if $r = 16$, this defines a digit set where $p = 8$. Using this digit set, and two digits to
represent a number, only one number can be represented in a redundant manner. For
example, if the number 0.5 decimal is the number to be represented using a minimally
redundant digit set where $r = 16$, it may be expressed as

$$Z = (1)r^{0} + (-8)r^{-1}$$

or as

$$Z = (0)r^{0} + (8)r^{-1}.$$
No other number may be expressed in this redundant fashion. In a maximally redundant
digit set, one where

$$p = r - 1,$$

all numbers except 0 are representable in a redundant manner. Zero is not representable
in a redundant manner because $p_{max} = r - 1$ and a redundant representation of zero
violates one of the four basic properties of the Signed-Digit Number System. The level of
redundancy in a chosen system affects more than simply the way in which numbers
can be represented. When a maximally redundant digit set is chosen, the conversion
transform between conventional representations and SD representations is simple. However,
the normalization procedure is made complex. The opposite is true for a minimally
redundant digit set: conversion is difficult but normalization is simple. The digit set for
any SD Number System will range between these two extremes. When selecting the digit
set, done by the selection of $p$, the tradeoffs between the chosen degree of redundancy
and the complexity of the hardware must be examined. In a system where a number is
converted, used extensively, and then assimilated back to a conventional representation,
the frequency of the conversion process is much less than the frequency of normalization.
Therefore, in such a system, a digit set which is minimally redundant should be chosen. The
opposite is true when the frequencies of conversion and assimilation approach the frequency
of normalization. The majority of the work presented in the literature [13, 12, 15]
has shown that in an environment where the frequency of normalization is greater
than the frequency of conversion, such as in a pipelined processor, $p = 10$ yields the best
tradeoff between conversion and normalization complexities.
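
The two radix-16 representations of 0.5 quoted above can be checked with exact rational arithmetic. The helper names `sd_value` and `in_digit_set` are hypothetical and serve only to illustrate positional weighting with signed digits.

```python
from fractions import Fraction

def sd_value(digits, radix):
    """Value of the digit list [Z0, Z1, ...] weighted by radix**0, radix**-1, ..."""
    return sum(Fraction(d) * Fraction(radix) ** -i for i, d in enumerate(digits))

def in_digit_set(digits, p):
    """True when every digit lies in the balanced set {-p, ..., p}."""
    return all(-p <= d <= p for d in digits)
```

Both `sd_value([1, -8], 16)` and `sd_value([0, 8], 16)` evaluate to exactly one half, and both digit lists lie within the digit set for $p = 8$.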

The normalization of an SD number is performed by shifting digits and adjusting
the exponent. An SD number is normalized if

1. The most significant digit satisfies $|Z_0| = 1$ and $|Z_0 + Z_1 r^{-1}| < 1$, or

2. $Z_0 = 0$ and $|Z_1 r^{-1} + Z_2 r^{-2}| \ge r^{-1}$, or

3. All of the digits are 0.

Since normalization shifts digits and not bits, the exponent is adjusted by the binary
equivalent of the log base 2 of the radix for each shift. The exponent of an SD number may
be represented in either SD or conventional form; however, by keeping it in a conventional
form, the conversion, assimilation, and alignment processes are kept relatively simple.
However, during the alignment process for addition, if the exponents are not the same or
some multiple of the log base 2 of the radix apart, alignment cannot occur. Therefore,
during the conversion process, the exponents must be adjusted such that all numbers
represented in SD form have exponents which are a multiple of the log base 2 of the radix
apart. This is done by shifting the conventionally represented input such that the $n$ least
significant bits, where $2^n = \log_2 r$, are the same for all SD number exponents. When the
radix equals 16, the two least significant bits of the exponent are required to be the same.

Signed-Digit  Numeric  Units 

The SD numeric units for the processor consist of the conversion, adder/subtractor,
multiplier, and assimilation units. The conversion and assimilation units have only a single
input while the adder/subtractor and multiplier each have two primary inputs. The
processor represents SD numbers with radix-16 weighting and a minimally redundant digit
set, $p_{max} = 10$. Each numeric unit is constructed from a common set of macrocells to be
described in detail later.

Conversion Unit  The conversion unit takes, as its input, a single number represented
in some conventional form, such as IEEE double precision. Before the input can be operated
on, it must be checked to ensure that it is a legal number and not an infinity or a NaN [6].
If the input is not a legal number, an error signal is generated and the conversion process is
aborted. However, if the input is a legal number, the conversion process begins. To explain
the conversion process, an IEEE single precision number is used.

A single precision floating point number is represented by a 23 bit mantissa, an 8 bit
exponent, and a single sign bit. The mantissa has an implied 1 in front and is expressed as
1.XXXXX···XX, which can represent a value in the range $(1.0 \le M < 2)$. To convert the
number to SD representation, the range of the mantissa should be $[1/r, 1)$, which simplifies
the normalization of the SD number after conversion to, at most, one left shift. Therefore,
if the mantissa is shifted right one to four bit places, it is within the range required for SD
conversion. The number of places to shift the mantissa is determined by the exponent. The
exponent is expressed by 8 bits and has a bias value of +127; the range of the unbiased
exponent is $-126$ to $+127$. The two ends of the possible range of the biased exponent, 0
and 255, are used to represent 0 and $\pm\infty$, which are handled separately. To convert to
a radix-16 SD number, the exponent must have the form XXXXXX00. Therefore, the
number of right shifts to the mantissa is equal to 4 minus the value represented by the two
least significant bits of the exponent. This always shifts the mantissa at least one place.
The only time the result will not be within the required range for SD conversion is when
the mantissa is 1.000···00 and the exponent is XXXXXX00. This is the only condition
where the mantissa requires zero shifts. Once the mantissa is shifted and the exponent is
adjusted to reflect the right shifts, the SD conversion may occur.

Figure 4.1. Conversion Recoding Hardware and Data Flow. (Outputs $Z_0, Z_1, \ldots, Z_{j-1}, Z_j$ form the Signed-Digit output.)
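
The shift rule can be written out directly. The sketch below applies the "4 minus the two least significant bits" rule to the stored exponent field, which is an interpretation of the text; the function names are hypothetical and a plain Python float stands in for the mantissa register.

```python
def alignment_shift(exponent_field):
    """Return (right_shifts, adjusted_exponent) for radix-16 SD conversion."""
    shifts = 4 - (exponent_field & 0b11)     # 4 minus the two low exponent bits
    adjusted = exponent_field + shifts       # each right shift raises the exponent by one
    return shifts, adjusted

def shift_mantissa(mantissa, exponent_field):
    """Shift a mantissa in [1, 2) right so that it lies in [1/16, 1)."""
    shifts, adjusted = alignment_shift(exponent_field)
    return mantissa / (1 << shifts), adjusted
```

For a 4-bit exponent field of `0101`, the rule gives three right shifts and an adjusted exponent of `1000`, whose two low bits are zero as required.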

SD conversion is a recoding process in which the input, the shifted mantissa, is split
into four-bit slices and recoded to adhere to the SD digit set $D_p$. Figure 4.1 shows the
conversion recoding hardware and data flow. The shifted mantissa is input into a recoder,
$S1$, and recoded such that the output of $S1$ is $X$ and $T$, where $X$ and $T$ are elements
in the digit sets $D_x$ and $D_t$ respectively and whose values are related to the input by the
function

$$B_i = Xr + T.$$

The  digit  set  Dx  is  required  to  consist  of  the  elements  {0, 1}.  The  digit  set  Dt  is  determined 
by  the  requirement  that 


When $p = \lceil (r-1)/2 \rceil + 2$, $T_{max}$ should equal $\lceil (r-1)/2 \rceil$ [13]. This makes the digit set
$D_t$ minimally redundant when used with $D_x$.

$$D_t = \{-8, -7, \ldots, -1, 0, 1, \ldots, 7, 8\}$$

The outputs of $S1_R$ are the inputs into the summer S2. This summer adds the inputs,
X and T, and outputs the digit Z, which is an element of $D_p$. All digits are expressed in
binary 2's complement format. The sign bit of the floating point number is used below
the S2 level to determine the correct representation of Z, either Z or its 2's complement.
A simple example of the conversion process is shown in Figure 4.2. For simplicity, a
mantissa of 12 bits, an exponent of 4 bits, and one sign bit are shown.

The value of the input, expressed in radix-16, to the conversion unit is

$$-(0 \cdot 16^{0} + 3 \cdot 16^{-1} + 11 \cdot 16^{-2} + 14 \cdot 16^{-3}) = -3 \cdot 16^{-1} - 11 \cdot 16^{-2} - 14 \cdot 16^{-3} \qquad (4.1)$$

The second term on the right hand side of expression 4.1 may be re-expressed as

$$-11 \cdot 16^{-2} = (5 - 16) \cdot 16^{-2} = 5 \cdot 16^{-2} - 16 \cdot 16^{-2} = 5 \cdot 16^{-2} - 1 \cdot 16^{-1}.$$

Similarly, the third term may be re-expressed as

$$-14 \cdot 16^{-3} = (2 - 16) \cdot 16^{-3} = 2 \cdot 16^{-3} - 16 \cdot 16^{-3} = 2 \cdot 16^{-3} - 1 \cdot 16^{-2}.$$

The right hand side of the expression 4.1 may be re-expressed as

$$-3 \cdot 16^{-1} + (5 \cdot 16^{-2} - 1 \cdot 16^{-1}) + (2 \cdot 16^{-3} - 1 \cdot 16^{-2}) = -4 \cdot 16^{-1} + 4 \cdot 16^{-2} + 2 \cdot 16^{-3}.$$

This expression is the same as the final conversion result shown in Figure 4.2. After
conversion, the exponent is carried along with the SD number and used the same as in
standard floating point arithmetic. However, the exponent has two fewer bits, since the two
least significant bits are dropped because they are assumed to be 0. A block diagram of the
conversion stage is shown in Figure 4.3. As stated previously, the SD number out of the
conversion process may require, at most, one left shift to normalize.

Figure 4.2. Conversion Recoder Example. Input in conventional form: -1.11011111 E0101.
Output in signed-digit form: 0 -4 4 2 E10.
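The recoding and summing steps of the conversion can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware of Figure 4.1: the function names (`s1r_recode`, `convert`, `sd_value`) are hypothetical, the mantissa is assumed to be already shifted and split into radix-16 digits, and the tie at an input of 8 is resolved as X = 1, T = -8.

```python
from fractions import Fraction

R = 16  # radix of the SD representation used in the text


def s1r_recode(n):
    """Recode a non-redundant digit 0..15 into (x, t) with n == x*R + t."""
    if not 0 <= n < R:
        raise ValueError("input digit must be in 0..15")
    # Assumption: digits of 8 and above produce X = 1 and a negative T.
    return (1, n - R) if n >= R // 2 else (0, n)


def convert(mag_digits, negative):
    """Convert magnitude digits (most significant first) to SD digits."""
    xs, ts = zip(*(s1r_recode(d) for d in mag_digits))
    if xs[0] != 0:
        raise ValueError("leading digit must be below 8 after the alignment shift")
    # S2 step: each output digit is T_i plus the X produced one place to the right.
    z = [t + (xs[i + 1] if i + 1 < len(xs) else 0) for i, t in enumerate(ts)]
    # The sign is applied last by negating every digit.
    return [-d for d in z] if negative else z


def sd_value(digits):
    """Exact value of an SD digit list whose first digit has weight R**0."""
    return sum(Fraction(d) * Fraction(R) ** -i for i, d in enumerate(digits))
```

With the digits of expression 4.1 as input, `convert([0, 3, 11, 14], negative=True)` yields the digits 0, -4, 4, 2 of Figure 4.2.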

The level of complexity of conversion is minimal; however, an additional stage in
the pipeline is required. This disadvantage must be offset by some advantage in
addition/subtraction and multiplication.

Adder/Subtractor Unit  Addition is very similar to the conversion process, with only
minor exceptions. The first change is the alignment of the exponents. This is simpler
than in standard representation since the exponents are two bits shorter and the number
of digits to shift is less than the number of bits to shift in standard floating point. Then,
instead of the recoder $S1_R$ having a single input, $S1_A$ is a summer and has, as its input,
two numbers in SD format. The outputs of $S1_A$ are X and T, but the digit set $D_x$ must
now include a -1. The digit set for T, $D_t$, is unchanged. The summer $S1_A$ performs the
function

$$X r^{-(i-1)} + T r^{-i} = IN1 \cdot r^{-i} + IN2 \cdot r^{-i}.$$

The maximum sum of the inputs is defined by $2p$ and gives a maximum sum of 20. This
range is covered by the range of $Xr + T$. The summer S2 is unchanged with the exception
of the required -1 in the input digit set of X. The normalization of an SD number after
addition requires, at most, one right shift or multiple left shifts. Rounding is required if a
right shift occurs and is discussed at the end of the multiplication section. The complexity
of SD addition is of the same order as the conversion process. In comparison to standard
binary addition, the alignment of the exponents must still occur, though the exponents
are two bits shorter for an SD number. Also, the maximum carry propagation for a number
expressed in SD form is 1 digit, whereas a number expressed in binary may require a
carry propagation across its entire field. This is the benefit of SD addition over standard
binary addition. The SD adder's data flow is shown in Figure 4.4 for four-digit addition,
less exponent adjust, normalization and rounding.
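The one-digit carry limit can be illustrated with a small Python model of the two adder levels. This is a sketch under stated assumptions rather than the thesis hardware: the names `s1a`, `sd_add`, `sd_sub` and `sd_value` are hypothetical, both operands are assumed to be already aligned and of equal length, and T is taken from the sign-extended low four bits of the digit sum, so that T lies in -8..7.

```python
from fractions import Fraction

R = 16  # radix of the SD representation


def s1a(a, b):
    """First level: split a + b into (x, t) with a + b == x*R + t."""
    s = a + b
    t = (s + R // 2) % R - R // 2   # low four bits, read as a signed value
    return (s - t) // R, t          # x is -1, 0 or 1 for |s| <= 20


def sd_add(a, b):
    """Add two aligned SD digit lists (most significant digit first).

    The result has one extra leading digit, which is the X output of the
    most significant S1_A stage.
    """
    xs, ts = zip(*(s1a(p, q) for p, q in zip(a, b)))
    z = [xs[0]]
    for i, t in enumerate(ts):
        x_in = xs[i + 1] if i + 1 < len(xs) else 0
        z.append(t + x_in)          # second level (S2): no further carry
    return z


def sd_sub(a, b):
    """Subtract by negating every digit of b before the S1_A level."""
    return sd_add(a, [-d for d in b])


def sd_value(digits, first_exp=0):
    """Exact value; the first digit has weight R**first_exp."""
    return sum(Fraction(d) * Fraction(R) ** (first_exp - i)
               for i, d in enumerate(digits))
```

Each output digit depends only on its own digit pair and the pair one place to its right, which is the software analogue of the one-digit carry propagation described above.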
Figure 4.3. Block Diagram of Conversion Stage. (IEEE Standard 754 input: sign, mantissa,
exponent. Signed-digit result: mantissa, exponent.)

Figure 4.4. Data Flow in SD Adder. (Signed-digit inputs: A0 B0, A1 B1, A2 B2, A3 B3.
Signed-digit result: Z0 Z1 Z2 Z3.)

SD subtraction is essentially the same as SD addition, with the following exception.
Prior to the $S1_A$ level, the digit to be subtracted is 2's complemented. The remainder of
the circuit is unchanged. A block diagram of the addition/subtraction unit is shown
in Figure 4.5.

Multiplier  Unit  The  SD  Multiplier  computes  all  of  the  partial  products  in  parallel, 
in  its  first  level.  The  next  levels  sum  the  partial  products,  two  at  a  time,  until  a  single 
result  is  obtained.  Then,  the  result  is  normalized,  rounded  and  a  final  result  obtained. 
The  multiplier  stage  used  to  compute  the  partial  products  is  discussed  first  due  to  the 
additional  digit  sets  used  in  the  multiplication  scheme  which  have  not  been  presented  yet. 

Figure 4.5. SD Addition/Subtraction Unit. (Signed-digit inputs: mantissa, exponents.
Output: signed-digit sum Z.)

Figure 4.6. Single Digit by Single Digit Multiplier, M0.

A single digit multiplier, M0, is shown in Figure 4.6. The two additional digit sets
for the multiplier are $D_u$ and $D_w$ from M0. The maximum values of these digit sets are
determined by the requirement to cover the maximum range of the input product, and
the requirement of redundancy for the output. M0 multiplies two digits in the digit set
$D_p$ and outputs the result as


$$U r^{-(i+k-1)} + W r^{-(i+k)} = (B_i r^{-i}) \cdot (A_k r^{-k}).$$

To express the product in a redundant manner, the digit set of W must be, at least,
minimally redundant. This requires $W_{max} \ge \lceil (r-1)/2 \rceil$, which is the same requirement
as for $T_{max}$ discussed earlier. No benefit is achieved by having $D_w$ more than minimally
redundant, but there is a cost in attempting to do so, as the complexity of the entire
multiply hardware increases as the redundancy increases. Therefore, $D_w$ is established as
a minimally redundant digit set.

$$D_w = \{-8, -7, \ldots, -1, 0, 1, \ldots, 7, 8\}$$

The required digit set for U can now be established. Since the maximum absolute value
of the product is $|AB|_{max} = 100$, then

$$U_{max} = \left\lceil \frac{100 - W_{max}}{16} \right\rceil = 6.$$

With these two digit sets, $D_u$ and $D_w$, the entire range of the product of A and B is
representable with minimal redundancy. The remaining digit sets in the multiplication
scheme above, $D_t$ and $D_x$, are unchanged from their definition given earlier, with the
exception of $D_{t'}$ not equal to $D_t$. This will be explained later in this section.

The digit sets used for multiplication are $D_p$, $D_w$, $D_u$, $D_t$ and $D_x$; the values in each
digit set are

$D_p$ = {-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

$D_w$ = {-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8}

$D_u$ = {-6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6}

$D_t$ = {-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8}

$D_x$ = {-1, 0, 1}
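That these digit sets cover every single-digit product can be checked with a short Python sketch. The function name `m0_recode` is hypothetical, and the choice of W as the sign-extended low four bits of the product is an assumption made for illustration; any split satisfying $AB = Ur + W$ within the digit sets would serve equally well.

```python
R = 16                  # radix
D_P = range(-10, 11)    # input digit set D_p


def m0_recode(a, b):
    """Split the product a*b into (u, w) with a*b == u*R + w."""
    p = a * b
    w = (p + R // 2) % R - R // 2   # signed low four bits, -8..7
    u = (p - w) // R                # remaining high part
    return u, w


# Exhaustive check over all 21 x 21 digit pairs.
for a in D_P:
    for b in D_P:
        u, w = m0_recode(a, b)
        assert u * R + w == a * b
        assert -6 <= u <= 6     # u stays inside D_u
        assert -8 <= w <= 8     # w stays inside D_w
```

The extreme product 10 x 10 = 100 is split as U = 6 and W = 4, which agrees with the value $U_{max} = 6$ derived above.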

In the Variable Precision Module presented by [13], the digit set of T' is allowed to
be larger than the digit set of T, $D_t$. $D_{t'}$ may be as large as $D_{t+x}$. This increases the
size of $D_{t'}$ by 1 on each side of the symmetric set over $D_t$. Since the partial products are
computed in full parallel and not in serial or in an array structure, the additional size of
the digit set $D_{t'}$ is not required. However, to optimize later aspects of the multiplication
scheme, specifically during the addition of the partial products to form the end result, the
ability to input a T' in $D_{t'}$, as defined above, will prove useful at no cost.

In Figure 4.6, it is important to note the shifting of the resultant output as compared
to the input. The most significant digit output is two digit places to the left of the most
significant input digit. This is because M0 outputs U one digit place above its inputs,
and $S1_A$ in turn outputs X one digit place above its own inputs. Therefore, the resultant
output, $Z_{-2}$, is weighted $r^2$ times the digit place of the inputs.

For  a  full  parallel  multiplier  to  multiply  a  complete  SD  number,  B,  by  a  single 
digit,  Ak,  the  single  digit  multiplier  stage  is  duplicated  for  each  digit  in  B.  The  result  of 
replicating  the  stage  is  shown  in  Figure  4.7  and  forms  a  full  parallel  multiplier  block. 

Since the computation of the partial products occurs in parallel, $W_0$ and $T_0$ are
always 0. This simplifies the leftmost stage of the block. $S1_{A0}$ and $S2_0$ are not required
because the maximum value of U out of $M0_0$ is $U_{max} = 6$ which, when added to 0 in $S1_{A0}$,
results in $X = 0$ and $T_{max} = 6$. Therefore, $S1_{A0}$ may be removed completely and U, from
$M0_0$, can go directly to the T input of $S2_1$. $S2_0$ is not required because both of its inputs
are always 0. This eliminates $S1_{A0}$, $S2_0$, and the $Z_{-2}$ output.

To  multiply  two  SD  numbers,  the  multiplier  block,  shown  in  Figure  4.7,  is  replicated 
so  that  each  digit  in  A  is  used  to  form  a  partial  product  with  the  number  B. 

The  remaining  levels  of  the  multiplier  unit  sum  the  partial  products  after  shifting  the 
products  to  correct  for  the  decimal  point  position  of  Ak.  The  following  discussion  simplifies 
the  summation  levels  by  reducing  the  number  of  digit  adders  required  in  each  level.  What 
must  be  kept  in  mind  is  that  the  inputs  to  the  multiplier  are  normalized  SD  numbers.  This 
is  important  because  significant  savings  in  the  amount  of  hardware  required  to  sum  the 
partial  products  will  result. 

Figure 4.7. Single Digit by SD Number Multiplier Block.

Because the inputs are normalized, the maximum absolute value of $B_0$ is 1. If $|B_0|$
is 1 then $B_1$ is either 0 or it has the opposite sign of $B_0$. This is a requirement of a
normalized number; if $|B_0| = 1$ then $|B_0 + B_1 r^{-1}|$ must be less than or equal to 1. Therefore,
the maximum magnitude of the resultant U out of $M0_0$ is 1; and, if $|U| = 1$, then W out of
$M0_0$ must satisfy $5 \le |W| \le 8$ and have the opposite polarity of U. The U out of $M0_1$ is in
the range $0 \le |U| \le 6$ and has the same polarity as W out of $M0_0$. This is all with the
condition that U out of $M0_0$ is not 0. Since W and U into $S1_{A1}$ have the same polarity,
$5 \le |W + U| \le 14$ and the sum has the opposite polarity of U out of $M0_0$. The
resultant X out of $S1_{A1}$ has the same sign as $(W + U)$. Therefore, the inputs into $S2_1$
are U, from $M0_0$, and X, from $S1_{A1}$, with the constraints that $|U| = 1$ and either $X = 0$
or $X = -U$. The value of $Z_{-1}$ is the sum of these inputs and is either U, where $|U| = 1$,
or 0, giving $|Z_{-1}|_{max} = 1$.

The next condition which needs to be looked at is when U out of $M0_0$ equals 0. When
this is the case, $|W| \le 7$. If W is any value except 0, then $|B_0| = 1$ and the same condition
holds for $B_1$ as above. The output U from $M0_1$ must be 0 or in the portion of the digit set
$D_u$ which has the opposite sign of W from $M0_0$. The summer $S1_{A1}$ sums W from $M0_0$,
$|W| \le 7$, and U from $M0_1$, $|U| \le 6$, with the constraint of opposite polarity, and outputs
$X = 0$ and $|T| \le 7$. Therefore, the inputs into $S2_1$ are both 0 and the output $Z_{-1} = 0$.

The last condition to look at is when $B_0 = 0$ and $B_1$ is any element in $D_p$. With this
condition, U and W out of $M0_0$ are 0 and U out of $M0_1$ may be any element in the set
$D_u$. The inputs into $S1_{A1}$ are W, from $M0_0$, and U, from $M0_1$. With these inputs, X out
of $S1_{A1}$ is 0 and $T = U$. Therefore, the inputs to $S2_1$ are both 0, so the output $Z_{-1} = 0$.

These are the only possible combinations that can affect $Z_{-1}$; therefore, the possible
values of $Z_{-1}$ are {-1, 0, 1}. This proves to be an important fact which reduces the
amount of hardware required in the partial product summers. It is also important to note
that $Z_{-1}$ will always equal 0 when $A_k = A_0$. The reason for this is as described above for
the case when U out of $M0_0$ equals 0, which is always the case when $A_k = A_0$.

As stated previously, the summer levels of the SD multiplier form a tree structure
in which the number of partial products halves at each level down the tree. Figure 4.8
shows this tree structure summing eight partial products. The SL2 summer sums two
partial products, $P_n$ and $P_{n+1}$, which are shifted one digit position from each other due
to the relative positions of their $A_k$ digits. The most significant digit of $P_n$ is, as
described above, -1, 0, or 1. Therefore, when summing at this level, the most significant
$S1_A$ adder is not required and the digit may be input directly into the most significant S2
adder. The least significant digit of $P_{n+1}$ bypasses the SL2 summer completely since $P_n$
does not have an input to add with it. The SL3 summer sums the results of SL2, which
are shifted two digit positions from each other. This is where the digit set $D_{t'}$ becomes
a factor. If $D_t$ is expanded to the size of $D_{t'}$, then the most significant digit of $P_{n,n+1}$
bypasses the SL3 summer and the next most significant digit may be input directly into
an S2 adder. The maximum magnitude of this next most significant digit is 9 because
it is an output from the previous level, where $|T + X|_{max} = 9$. The least significant two
digits of $P_{n+2,n+3}$ bypass the SL3 summer. The SL4 summer sums the results from the
SL3 summers. These inputs are shifted four digit positions from each other. The three
most significant digits of $P_{n,n+1,n+2,n+3}$ bypass the SL4 summer, as do the four least
significant digits of $P_{n+4,n+5,n+6,n+7}$. The fourth most significant digit of $P_{n,n+1,n+2,n+3}$ is
input into the most significant S2 adder. If more summation levels are required to sum
the partial products, this process is continued until a single result is obtained. Once this
single result is computed, the result is normalized. The result may require, at most, one
digit shift to the right or two digit shifts to the left to normalize after multiplication.

Figure 4.8. Partial Product Summer Structure.
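The overall flow, parallel partial products followed by a pairwise summation tree, can be modelled in Python as follows. This is a behavioural sketch only: it does not model the bypass optimizations described above, the names (`recode`, `sd_add`, `partial_product`, `sd_multiply`, `sd_value`) are hypothetical, and an SD number is represented here as a pair `(e, digits)` in which `digits[j]` has weight $16^{e-j}$, a convention chosen for this example.

```python
from fractions import Fraction

R = 16  # radix


def recode(s):
    """Split s into (x, t) with s == x*R + t and -8 <= t <= 7."""
    t = (s + R // 2) % R - R // 2
    return (s - t) // R, t


def pad(e, digits, top, n):
    """Align a digit list so that index 0 has weight R**top and length is n."""
    lead = top - e
    return [0] * lead + list(digits) + [0] * (n - lead - len(digits))


def sd_add(a, b):
    """Carry-limited addition of two SD numbers given as (e, digits)."""
    (ea, da), (eb, db) = a, b
    top = max(ea, eb)
    low = min(ea - len(da) + 1, eb - len(db) + 1)
    n = top - low + 1
    pairs = zip(pad(ea, da, top, n), pad(eb, db, top, n))
    xs, ts = zip(*(recode(p + q) for p, q in pairs))
    z = [ts[j] + (xs[j + 1] if j + 1 < n else 0) for j in range(n)]
    return top + 1, [xs[0]] + z


def partial_product(a_k, k, b):
    """Multiply the SD number b (first digit weight R**0) by digit a_k * R**-k."""
    uw = [recode(a_k * b_i) for b_i in b]
    u = (-k + 1, [x for x, _ in uw])   # U digits sit one place higher than W
    w = (-k, [t for _, t in uw])
    return sd_add(u, w)


def sd_multiply(a, b):
    """Form all partial products, then sum them two at a time in a tree."""
    level = [partial_product(a_k, k, b) for k, a_k in enumerate(a)]
    while len(level) > 1:
        nxt = [sd_add(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


def sd_value(e, digits):
    """Exact value of an (e, digits) SD number."""
    return sum(Fraction(d) * Fraction(R) ** (e - j) for j, d in enumerate(digits))
```

Because every level reuses the same carry-limited adder, the depth of the tree grows with the logarithm of the number of partial products, which is the property the summer structure of Figure 4.8 exploits.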


The last step is to round the result to obtain the final output. The maximum round-off
error is less than [...] with simple truncation, where J is the number of digits used
to represent a normalized SD number. Nearest rounding is easily accomplished in SD
number representation. If an SD number is represented by J digits, 0 through J - 1, then
nearest rounding will affect only digit J - 1. The maximum magnitude of digit J - 1 is
$|Z_{J-1}|_{max} = |T + X|_{max} = 9$; and, since rounding can affect digit J - 1 by, at most,
1, the maximum magnitude of digit J - 1 after rounding is 10, which is in $D_p$. The maximum
error by nearest rounding is

$Error_{max}$ = [...]

The IEEE Standard 754-1985 requires the intermediate result to be computed to a
greater precision and then rounded to the precision of its destination. Due to the way
multiplication is performed in full parallel, the least significant digits of the partial
products which do not affect the rounding procedure could be dropped. However, very
little hardware is saved by doing this and it will not conform to the IEEE standard.

Figure 4.9. SD Assimilator Data Flow.

Assimilation Unit  The final unit performs the assimilation of an SD number to standard
binary, such as IEEE floating point. The assimilator is an additional cost of using SD
number representation and requires a separate stage in the pipeline. In fact, assimilation is
the most costly part of SD representation because this is the only operation with significant
carry-borrow propagation delays. The negative SD digits represent the problem. In order
to convert the negative digits to positive, the assimilation stage performs the function

$$-r V_i + N_i = Z_i - V_{i+1}$$

where Z is an SD digit in $D_p$, N is a 4-bit number in non-redundant form, and V is an
element in {0, 1} which represents a borrow. The assimilator is shown in Figure 4.9. The
borrow output from each stage has the possibility of propagating left across all of the
stages in this level. The possible values of $N_0$ are 0, 1, 14, or 15. If $N_0$ is 0 or 1, then
$V_0$ is 0, which indicates that the SD number assimilated is positive. However, if $N_0$ is 14
or 15, then $V_0$ is 1, indicating the SD number is negative, and the output N is given in
2's complement form. A second level in the assimilation process 2's complements the
output, and a multiplexer, controlled by $V_0$, selects which output to pass as the result. The
final levels normalize the result, adjust the exponent, and form the final result to IEEE
standard. The result may require, at most, four left shifts to normalize. Rounding is also
required for the result and is as specified by the IEEE standard. A block diagram of the
assimilation process is shown in Figure 4.10. To optimize the time required to perform
the assimilation, the 2's complementor and the multiplexer should be placed before the
assimilator. Performing a 2's complement on an SD number takes substantially less time
than on a standard binary number.
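The borrow chain of the assimilation function can be sketched in Python as below. The names `assimilate` and `sd_to_sign_magnitude` are hypothetical, and obtaining the magnitude by negating the SD digits and assimilating a second time is a software convenience assumed for this example, not a description of the hardware levels.

```python
R = 16  # radix


def assimilate(z):
    """Propagate borrows right to left: N_i = Z_i - V_{i+1} + R * V_i.

    Returns the non-redundant digits N (each 0..15) and the final borrow V_0.
    """
    n = [0] * len(z)
    v = 0                              # no borrow into the rightmost stage
    for i in range(len(z) - 1, -1, -1):
        t = z[i] - v                   # digit minus the incoming borrow
        v = 1 if t < 0 else 0          # borrow passed to the stage on the left
        n[i] = t + R * v
    return n, v


def sd_to_sign_magnitude(z):
    """Return (negative, magnitude_digits) for an SD digit list."""
    n, v0 = assimilate(z)
    if v0:
        # Negating an SD number only requires negating each digit.
        n, _ = assimilate([-d for d in z])
        return True, n
    return False, n
```

For the converted example of Figure 4.2, `assimilate([0, -4, 4, 2])` gives the digits 15, 12, 4, 2 with $V_0 = 1$, and `sd_to_sign_magnitude` recovers the original magnitude digits 0, 3, 11, 14 with a negative sign.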

Figure 4.10. SD to IEEE Assimilator.


V.  Signed-Digit  Hardware  Modules 


When representing a number in SD form and performing operations on it, unique
hardware must be designed. Since SD representation has great advantages over standard
binary, these advantages should be exploited in the hardware. The primary modules used
for the SD operations presented in Chapter 2 are the $S1_R$ Recoder, $S1_A$ Adder, S2 Adder,
M0 Multiplier, and the A1 Assimilator. Each of these is discussed, as well as its estimated
performance parameters. The performance parameters are obtained through the
use of SPICE analysis. CIFPLOTs of the $S1_A$ Adder, S2 Adder, and M0 Multiplier are
in Appendix B.

$S1_R$ Recoder

The $S1_R$ Recoder is the simplest of all SD hardware modules. It accepts a 4-bit
slice input and outputs X and T in the digit sets $D_x$ and $D_t$ respectively. The input is
expressed in binary non-redundant form, which gives it a range of number representation
from 0 to 15. The digit set of X is {-1, 0, 1} and represents a value one radix-16 digit place
higher than the least significant bit of the input. The digit set of T is {-8, -7, ..., 0, ..., 7, 8}
and represents a value which has the same positional weighting as the least significant bit of
the input. Both X and T are represented in 2's complement form, as are all numbers in
SD representation. The input is recoded by the $S1_R$ Recoder such that any value of input
is recoded into X and T by the function

$$N = Xr + T.$$

For all input values in the range (0, 8) the value may pass directly to T. However, if the
input is in the range (9, 15), the value of X is 1 while the value of T is N - 16. By analyzing
the possible inputs and their required results, a simple solution is developed. When the
input is in the range (0, 7), its most significant bit is 0. When the input is greater than 7,
the most significant bit is 1. Therefore, the $S1_R$ Recoder is designed without the use of
any logic gates; it is simply a routing problem. The input is routed directly to T; however,
the input is 4 bits wide while T is 5 bits wide. For sign extension of T, the most significant
bit of the 4-bit input is extended to be the most significant bit of T. The most significant
bit of the input is also used as the least significant bit of X. Since X is a 2-bit number
and the input is expressed in a non-redundant form, X is only 0 or 1; therefore, its most
significant bit is always 0. With this routing, an input of 8 is recoded as X = 1 and
T = -8, which is an equally valid recoding. Figure 5.1 shows the routing of the $S1_R$ Recoder.

Figure 5.1. $S1_R$ Recoder Routing.
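The routing can be checked exhaustively with a bit-level Python sketch. The names `s1r_route` and `twos_complement` are hypothetical; bit tuples are written most significant bit first, which is a convention chosen for this example.

```python
def s1r_route(n):
    """Route a 4-bit input (0..15) to X (2 bits) and T (5 bits), no logic gates."""
    n3, n2, n1, n0 = [(n >> k) & 1 for k in (3, 2, 1, 0)]
    t_bits = (n3, n3, n2, n1, n0)   # input copied to T, MSB repeated as sign bit
    x_bits = (0, n3)                # input MSB becomes the LSB of X
    return x_bits, t_bits


def twos_complement(bits):
    """Interpret an MSB-first bit tuple as a 2's complement integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    if bits[0]:
        value -= 1 << len(bits)
    return value


# Every 4-bit input must satisfy N = X*16 + T after routing.
for n in range(16):
    x_bits, t_bits = s1r_route(n)
    assert 16 * twos_complement(x_bits) + twos_complement(t_bits) == n
```

In this model the input's most significant bit appears at three destinations (two bits of T and one bit of X), while every other input bit appears at one.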

Since there is no logic required for the $S1_R$ Recoder, there is no appreciable propagation
delay through it. However, there are important VLSI considerations which must be kept
in mind. The loading on the most significant bit of the input is three times the loading
of the other bits of the input. When designing the $S1_R$ Recoder for a specific application,
the loading on the most significant bit must be compensated for by either using inverters
at the input and output ports or by ensuring the driver for the most significant bit is
scaled large enough for the load. The use of inverters at the input and output ports gives
the advantage of isolating the input drivers from the load that the outputs of $S1_R$ see.
This allows for the independent design of the follow-on modules and scaling of the recoder's
output drivers for those follow-on modules. The cost is the addition of 11 inverters.

$S1_A$ Adder

The $S1_A$ Adder accepts, as its inputs, two SD digits where each digit is an element
of the digit set $D_p$. SD digits are represented in 2's complement by 5 bits. The outputs of
the $S1_A$ Adder are X and T, where X and T are in the digit sets $D_x$ and $D_t$ respectively.
The first requirement of the $S1_A$ Adder is to add the inputs, giving a result which is 6 bits
wide. After the inputs are added, the result must be recoded into X and T.

The adder must be designed for inputs which are 5 bits wide and a carry-in bit. The
carry-in bit is connected to the control logic and used in conjunction with an inverter
to perform the subtraction operation. By designing the adder this way, it can perform
addition and subtraction faster due to the short propagation delay through an inverter
compared to a 2's complementor. The next step in the design of the adder is to select the
type of adder to use in order to minimize its propagation delay. The adder which best suits
the need for minimum propagation delay is a carry-select adder, which gives rapid lateral
carry propagation. SPICE simulations using 2 µm technology estimate the propagation
delay through the worst case path of the adder at 4.9 ns.

Recoding of the adder's 6-bit result is similar to the recoding in the $S1_R$ Recoder, with
the exception of the possibility of having a negative value for X. To perform the recoding,
the four least significant bits of the adder's result are routed to the four least significant
bits of T. The most significant bit of T is a sign extension of its next most significant
bit. X is determined by the two most significant bits of the adder's result and the most
significant bit of T. The two most significant bits of the adder's result are added to the
most significant bit of T to form X. This is done by using two half adders to compute
X. SPICE simulations for this step estimate that the worst case propagation time is 1.2 ns.
The complete $S1_A$ Adder is shown in Figure 5.2. An estimate of the overall propagation
time of the $S1_A$ Adder is 6.1 ns. This is the time required to obtain the most significant
bit of X; however, the time required for T is only the adder's time, 4.9 ns. A CIFPLOT
of the $S1_A$ Adder is in Appendix B. A transistor count of the $S1_A$ Adder shows that 160
transistors are used.
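The recoding rule just described can be verified with a short bit-level Python sketch. The function name `s1a_recode` is hypothetical, and the model works on integers and masks rather than on individual half adders; it only illustrates that adding the sign bit of T to the two most significant sum bits yields a consistent X.

```python
def s1a_recode(s):
    """Recode a 6-bit 2's complement sum s (-20..20) into (x, t).

    T is the low four bits of the sum, sign-extended from bit 3.
    X is the top two bits (read as a signed value) plus bit 3.
    """
    bits = s & 0x3F                       # 6-bit 2's complement pattern
    low4 = bits & 0xF
    b3 = (bits >> 3) & 1                  # becomes the sign bit of T
    t = low4 - 16 * b3
    top2 = bits >> 4                      # two most significant bits
    top2_signed = top2 - 4 * (top2 >> 1)  # read as a 2-bit signed value
    x = top2_signed + b3                  # the half-adder step, done arithmetically
    return x, t


# The identity s = 16*X + T must hold for every possible digit sum.
for s in range(-20, 21):
    x, t = s1a_recode(s)
    assert 16 * x + t == s
    assert x in (-1, 0, 1) and -8 <= t <= 7
```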

S2 Adder

The S2 Adder is very similar to the $S1_A$ Adder, except that the recoding
stage is not required. The S2 Adder has two inputs, X and T (or T'), which are in the
digit sets $D_x$ and $D_t$ (or $D_{t'}$) respectively. Therefore, the maximum value of their sum is
$T'_{max} + X_{max} = 9 + 1 = 10$. The addition is accomplished by using the same carry-select
adder described in the preceding section for the $S1_A$ Adder. However, the adder's
hardware is reduced by recognizing that the CARRY-IN to the first adder is always 0.
This reduces the hardware of the two least significant bit adders. Also, the hardware for
the most significant bit adder is reduced since CARRY-OUT is not required. Figure 5.3
shows the requirements of the S2 Adder. SPICE simulations have shown that the worst
case propagation delay is 4.9 ns. The CIFPLOT of the S2 Adder is in Appendix B. The
S2 Adder requires 129 transistors.

M0 Multiplier

The M0 Multiplier is the most complex module for SD arithmetic. The multiplier
has two inputs which are both elements of the digit set $D_p$. The results are two values, U
and W, which are in the digit sets $D_u$ and $D_w$ respectively. Multiplication is accomplished
by converting one of the SD digits to a modified radix-4 representation.

$$A_i = 4K'_i + K_i$$

In this representation, K and K' are pseudo-numbers in that they represent numbers in
the set {-2, -1, 0, 1, 2} but they are not coded in a standard manner. The encoder forms
K and K' from A by using the functions

K_0 = (A_1 xor A_4) and ([...] or A_4)
K_1 = A_0
K_2 = (A_0 or A_1)

K'_0 = [...]
K'_1 = (A_1 xnor A_2) or (A_3 and A_4)
K'_2 = (A_1 xor A_4) or (A_2 xor A_3)

Figure 5.2. Complete $S1_A$ Adder.


K and K' are coded such that they can operate directly on a set of three multiplexers each,
where the first multiplexer selects the B digit or its 2's complement. The second multiplexer
selects either the output of the preceding multiplexer or that output shifted left one bit.
Finally, the third multiplexer selects whether to pass the output of the second multiplexer
or to pass all zeros. K and K' each operate on a set of these multiplexers. However, K
and K', as well as the outputs of the multiplexer sets, are a radix-4 apart. Figure 5.4 shows
how the multiplexer sets are arranged and controlled. The least significant bit of K, and of
K', controls the Complementor Multiplexer while the next least significant bit controls the
Shifter Multiplexer. The most significant bit controls the Zero Multiplexer. The outputs
of the two multiplexer sets form two partial products which are shifted two bit positions
relative to each other.
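The multiplexer scheme can be modelled arithmetically in Python. This sketch does not reproduce the encoder's bit-level coding of K and K'; the splitting rule in `radix4_split` is an assumption chosen so that both radix-4 values fall in {-2, ..., 2}, and all function names are hypothetical.

```python
def radix4_split(a):
    """Split a digit a in -10..10 into (k, k_prime) with a == k + 4*k_prime."""
    sign = -1 if a < 0 else 1
    k_prime = sign * ((abs(a) + 1) // 4)   # assumed rule, keeps |k| <= 2
    k = a - 4 * k_prime
    return k, k_prime


def mux_set(k, b):
    """Three multiplexers in series: complement, shift, zero."""
    out = -b if k < 0 else b        # Complementor Multiplexer
    if abs(k) == 2:
        out <<= 1                   # Shifter Multiplexer: one bit to the left
    return out if k != 0 else 0     # Zero Multiplexer


def m0_product(a, b):
    """Product of two SD digits from two partial products a radix-4 apart."""
    k, k_prime = radix4_split(a)
    return mux_set(k, b) + (mux_set(k_prime, b) << 2)


# Exhaustive check over the digit set D_p.
for a in range(-10, 11):
    k, k_prime = radix4_split(a)
    assert abs(k) <= 2 and abs(k_prime) <= 2
    for b in range(-10, 11):
        assert m0_product(a, b) == a * b
```

Only negation, a one-bit shift, and zeroing are needed to form each partial product, which is why the digit is recoded into two radix-4 values instead of being multiplied directly.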

The partial products are added by using a 6-bit carry-select adder, similar to the
5-bit version described previously. The two least significant bits of the multiplexer set
controlled by K bypass the adder since the multiplexer sets are radix-4 apart in their
weighting. The final step is to recode the results of addition into the digit sets for U and
W. W is coded the same way that T is coded in the $S1_A$ Adder. The four least significant
bits of the result, two of which bypassed the adder, are routed to the four
least significant bits of W. The most significant bit of W is the sign extension of its next
most significant bit. U is coded the same way that X was coded, except that U is 4 bits
wide. Four half adders are used to recode the four most significant bits of the 6-bit adder's
result along with the most significant bit of W.

An overall diagram of the M0 multiplier is shown in Figure 5.5. The encoder for the
generation of K and K' is shown as part of the multiplier. In reality, this encoder is used
as a separate block when a single digit is being multiplied by a complete SD number. In
this case, the single digit is the input to the encoder and the resulting K and K' bits are
used for each multiplexer set corresponding to each digit in the SD number. This reduces
the required hardware.

Figure 5.4. M0 Multiplier's Multiplexer Arrangement.

The  performance  parameters  obtained  from  SPICE  analysis  are  worst  case  values. 
The time to encode an SD digit into K and K' is 2.5 ns. This encoding is done in parallel with the
formation of the 2's complement of the multiplexer digit and, in part, with the multiplexer
set. The total time to obtain partial product results from the multiplexer set, including
the  encoder  time,  is  4.3  ns.  The  addition  of  the  partial  products  requires  5.7  ns  and  the 
recoding  of  its  output  requires  3.7  ns.  However,  a  portion  of  the  recoding  stage  overlaps 
the  adder  stage.  Therefore,  the  partial  product  adder  and  the  recoding  of  its  output  were 
estimated  as  requiring  9.0  ns.  From  the  simulation  results,  the  estimated  time  to  multiply 
two  SD  digits  is  13.3  ns  for  the  formation  of  the  U  result  and  10.0  ns  for  the  W  result.  A 
CIFPLOT  of  the  MO  Multiplier  is  in  Appendix  B.  This  plot  shows  MO  with  the  encoder 
as  an  internal  structure.  In  this  configuration,  the  MO  Multiplier  requires  494  transistors, 
113  of  those  are  for  the  encoder. 
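
The quoted stage delays can be cross-checked with a few lines of arithmetic. The Python sketch below simply adds the figures given above; the constant and function names are hypothetical, and the reading that the W result leaves the adder without passing through the half-adder recoding is an inference from the fact that 4.3 ns + 5.7 ns matches the quoted 10.0 ns, not a statement taken from the SPICE runs.

```python
# Worst-case stage delays in ns, copied from the SPICE figures quoted in the text.
MUX_WITH_ENCODER = 4.3   # partial products available from the multiplexer sets
ADDER_ONLY = 5.7         # 6-bit carry-select addition
RECODE_ONLY = 3.7        # recoding of the adder output
ADD_AND_RECODE = 9.0     # adder plus recoding, allowing for their overlap


def m0_delays():
    """Return the estimated U and W delays and the adder/recoder overlap (ns)."""
    u_delay = MUX_WITH_ENCODER + ADD_AND_RECODE
    # Assumption: W needs only routing after the adder, so no recoding delay.
    w_delay = MUX_WITH_ENCODER + ADDER_ONLY
    overlap = ADDER_ONLY + RECODE_ONLY - ADD_AND_RECODE
    return u_delay, w_delay, overlap


if __name__ == "__main__":
    u, w, overlap = m0_delays()
    print(f"U: {u:.1f} ns, W: {w:.1f} ns, overlap: {overlap:.1f} ns")
```

Under these assumptions the sums agree with the 13.3 ns and 10.0 ns estimates, and the implied overlap between the adder and the recoder is about 0.4 ns.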

A1 Assimilator

The A1 Assimilator is the most time-consuming of all SD operations. This
is due to the borrow propagation delays across the entire field. The assimilator accepts
an SD digit, which is expressed in a redundant form, and outputs a result which is
non-redundant. A borrow signal is used to propagate negative values from a digit which is
weighted $r^{-i}$ to the digit on its left, which is weighted $r^{-i+1}$. If the digit is positive, its value
may be output directly. However, if the digit is negative, its magnitude must be subtracted from
16 and the result output. The borrows are used to decrement the output of the stage on the
left, a radix higher. The general configuration of the assimilator is shown in Figure 5.6.
The assimilator recodes the SD digit into a non-redundant form, by stripping out the four
least significant bits, and generates a borrow signal for the next stage. Once the digit is
expressed in a non-redundant form, the borrow from the stage on the right is subtracted
from it. The subtraction is accomplished by adding the borrow, with sign extension, to the
non-redundant result. The adder is configured as a 2-2 modified carry-select adder. The
recoding of the digit is performed by simple routing and requires negligible time. The
adder requires 4.5 ns to compute the final result.

Figure 5.5. Complete M0 Multiplier Configuration.

Figure 5.6. Assimilator for Signed-Digit Digit.
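
The borrow mechanism can be illustrated in software. The Python fragment below is a minimal digit-serial model, not the circuit of Figure 5.6: it walks a radix-16 SD number from the least significant digit upward, and the function name `assimilate` and the list-of-integers representation are assumptions made for this sketch. It also assumes every digit magnitude is at most 15.

```python
def assimilate(sd_digits, radix=16):
    """Convert signed digits (most significant first) to digits in 0..radix-1.

    Returns (borrow_out, digits). A borrow_out of 1 means the value was
    negative and the digits hold it modulo radix**len(sd_digits).
    """
    result = []
    borrow = 0
    for digit in reversed(sd_digits):      # least significant digit first
        digit -= borrow                    # apply the borrow from the right
        if digit < 0:
            digit += radix                 # e.g. -3 is output as 16 - 3 = 13
            borrow = 1                     # and the digit on the left is decremented
        else:
            borrow = 0
        result.append(digit)
    result.reverse()
    return borrow, result


if __name__ == "__main__":
    # 1*16**3 - 3*16**2 + 0*16 - 1 = 3327 = 0x0CFF
    print(assimilate([1, -3, 0, -1]))
```

In this example the borrow generated by the last digit ripples through every position to its left, which is the worst-case behavior that makes assimilation the slowest SD operation.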


VI.  Signed-Digit  Performance 


In  the  preceding  chapters,  the  SD  operation  units,  and  the  modules  with  which  the 
units  are  built,  were  described.  Performance  estimates  for  the  modules  were  given  in  terms 
of  propagation  delays  through  each  unit.  By  using  these  estimates  of  module  performance, 
the  SD  modules  can  accurately  be  described  in  VHDL.  Once  the  modules  are  described, 
SD  units  can  be  modeled  and  simulated. 

Signed-Digit  Module  Descriptions 

The S1R Recoder accepts a 4-bit input and provides the outputs T and X, which are
in the digit sets Dt and Dx respectively. The VHDL description of the entity interface is
defined by these signals.

use work.SD_DEFINITIONS.all;
entity S1_RECODER is
  port ( DATA_IN : in  bit_vector( 3 downto 0 );
         T_out   : out T_TYPE;
         X_out   : out X_TYPE );
end S1_RECODER;

The DATA_IN signal describes the 4-bit input, which is a 4-bit slice of the total input
mantissa. T_TYPE and X_TYPE are data types which describe subtypes of a bit_vector,
where T_TYPE is a bit_vector ( 4 downto 0 ) and X_TYPE is a bit_vector ( 1 downto
0 ). These subtypes are used to clarify the data types by giving them unique names
corresponding to the digit sets which they represent. All of the data types are defined in the
package SD_DEFINITIONS. The S1R Recoder is described behaviorally and only involves
proper routing of the input signals to the correct output lines. No generic parameters
are passed to the recoder since there is no requirement for altering the propagation delay,
which is essentially 0 ns. The complete VHDL description of the S1R Recoder is given in
Appendix C.

The S1A Adder is more involved than the recoder. It accepts two SD digits in the
digit set Dp and outputs T and X in the digit sets Dt and Dx respectively. The S1A
Adder also requires a control signal which indicates whether it is performing an addition or a
subtraction. The VHDL entity description defines these inputs.

use work.SD_DEFINITIONS.all;
entity S1_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( SD1_in  : in  SD_DIGIT;
         SD2_in  : in  SD_DIGIT;
         ADD_SUB : in  bit;
         X_out   : out X_TYPE;
         T_out   : out T_TYPE );
end S1_ADDER;

The data type SD_DIGIT is defined as a bit_vector ( 4 downto 0 ) in the package
SD_DEFINITIONS. X_TYPE and T_TYPE are as defined previously, while bit is
a predefined type. The generic parameter TECHNOLOGY_SCALE is used to linearly
alter the propagation delay through the adder. The default propagation delay,
with TECHNOLOGY_SCALE equal to 1.0, is determined through SPICE analysis using 2 micron
technology. If a different technology is used, the propagation delay is changed by setting
TECHNOLOGY_SCALE to linearly adjust for the new technology. The architectural
description of the adder is a behavioral description. This description converts the SD digits to
integer values, adds the integers, and converts the result into an X vector and a T value.
The T value is then converted to a T vector. Two functions are used in this behavioral
description, BIN_TO_INT and INT_TO_SD. These functions are defined in the package
SD_DEFINITIONS and called when required. The complete VHDL description is given in
Appendix C.
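
For readers without a VHDL simulator at hand, the two conversion functions can be mirrored in a few lines of Python. The sketch below follows the logic of BIN_TO_INT and INT_TO_SD as listed in the SD_DEFINITIONS package body in Appendix C; it is an illustration, not part of the thesis model. List index 0 stands for bit 0, and the explicit range check in `int_to_sd` is an addition that the VHDL function does not contain.

```python
def bin_to_int(bits):
    """Two's complement value of a bit list, where bits[0] is the LSB."""
    value = 0
    scale = 1
    for bit in bits[:-1]:          # all bits below the sign bit
        if bit:
            value += scale
        scale *= 2
    if bits[-1]:                   # the top bit carries a negative weight
        value -= scale
    return value


def int_to_sd(value):
    """Encode an integer in -16..15 as a 5-bit SD digit (bits[0] is the LSB)."""
    if not -16 <= value <= 15:     # added check, not in the VHDL original
        raise ValueError("value does not fit in a 5-bit SD digit")
    bits = [0] * 5
    if value < 0:
        bits[4] = 1
        temp = 16 + value
    else:
        temp = value
    weight = 8
    for i in range(3, -1, -1):     # bits 3 down to 0
        if temp >= weight:
            bits[i] = 1
            temp -= weight
        weight //= 2
    return bits


if __name__ == "__main__":
    for v in (-10, -1, 0, 7):
        print(v, int_to_sd(v), bin_to_int(int_to_sd(v)))
```

A round trip through both functions returns the original integer for every value a 5-bit digit can hold, which is a quick way to validate either implementation against the other.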

The S2 Adder accepts an X vector and a T vector, which are defined by the data
types X_TYPE and T_TYPE respectively. The output is an SD digit defined by the data
type SD_DIGIT. The S2 Adder does not require any control signals. The entity description
defines these inputs and the output.

use work.SD_DEFINITIONS.all;
entity S2_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( X_in   : in  X_TYPE;
         T_in   : in  T_TYPE;
         SD_out : out SD_DIGIT );
end S2_ADDER;

The architectural description of the S2 Adder is a behavioral description. T_in is
converted to an integer and incremented, decremented, or left unaltered depending on the bit
fields of X_in. The result is then converted to a bit vector, SD_out, defined by the data
type SD_DIGIT. TECHNOLOGY_SCALE is used as discussed previously. The complete
VHDL description for the S2 Adder is given in Appendix C.

The M0 Multiplier multiplies two SD digits and outputs, as its result, U and W,
which are in the digit sets Du and Dw respectively. There are no control signals required
for the multiplier. The inputs and outputs are defined in the entity description.

use work.SD_DEFINITIONS.all;
entity M0_MULT is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( SD_1  : in  SD_DIGIT;
         SD_2  : in  SD_DIGIT;
         U_out : out U_TYPE;
         W_out : out W_TYPE );
end M0_MULT;

The data types U_TYPE and W_TYPE are bit vectors which are defined in the
package SD_DEFINITIONS. The architectural description of the multiplier is behavioral.
The two SD digits are converted to integers and multiplied. The result is then converted
to a U vector and a W value, where the W value is then converted to a W vector through
the function call INT_TO_SD. The complete VHDL description of the M0 Multiplier is
given in Appendix C.

Once the VHDL descriptions of the SD modules were completed, each module was
tested. The tests were designed to validate the correctness of each module before
instantiating it in larger models. The S1_RECODER_TB, S1_ADDER_TB, S2_ADDER_TB, and
M0_MULT_TB test benches were written, analyzed, and simulated, and reports were generated to
verify correctness. These test benches and their report generators are given in Appendix C.
Simulation results are also presented in Appendix C.

Complete  SD  Multiplier 

A  SD  number  which  corresponds  to  the  precision  of  IEEE  double  precision  requires 
the  number  to  consist  of  16  digits,  0  to  15.  This  provides  a  precision  of  16“*®  =  2"®®. 
Therefore, to multiply two SD numbers, 16 multiplier blocks with 16 digit multipliers in
each block are required. This will result in 16 partial products. The partial products
are added in a tree structure with four levels until a single result is obtained. To build
a VHDL model of the multiplier, several sub-components were built. First, a multiplier block
which multiplies a single digit by an SD number was built. This block consists of 16 M0
Multipliers, 15 S1A Adders, and 15 S2 Adders. Since the S1A Adders are only used for
addition in a multiplier, the ADD_SUB control signals are set to ADD. The result out
of the block is a partial product which is 17 digits long. The entity description of the
multiplier block defines the inputs.

use work.SD_DEFINITIONS.all;
entity MULT_BLOCK is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( DIGIT_C : in  SD_DIGIT;
         SD_NUMB : in  SD_NUMBER;
         RESULT  : out PARTIAL_P ( 0 to 16 ) );
end MULT_BLOCK;


The data type SD_NUMBER is defined in SD_DEFINITIONS as an array ( 0 to 15 )
of SD_DIGIT, while PARTIAL_P is defined as an unbounded array of type SD_DIGIT. The
distinction between the two is made to identify an SD number as a distinct type apart from
any partial product types. The generic parameter is not directly used in the architecture
but is passed down to the lower modules. A structural description of the multiplier block
instantiates all of the required modules individually. The complete VHDL description is
given in Appendix C.

The next sub-component written is ADDER_1. ADDER_1 is an adder composed of a
single S1_ADDER and an S2_ADDER. This component was written to reduce the number
of component instantiation statements in the partial adder sub-components. ADDER_1
requires two SD_DIGIT inputs and a T_in input, and outputs an SD_DIGIT and a T_out. The
entity description defines its required signals.


use work.SD_DEFINITIONS.all;
entity ADDER_1 is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( SD1   : in  SD_DIGIT;
         SD2   : in  SD_DIGIT;
         T_in  : in  T_TYPE;
         T_out : out T_TYPE;
         SUM   : out SD_DIGIT );
end ADDER_1;


The architectural description is structural and instantiates one S1_ADDER and one
S2_ADDER. An X vector is declared within the architecture and provides the path between
the adders for this signal. TECHNOLOGY_SCALE is passed down to the adder modules.
A complete VHDL description is given in Appendix C.

Four levels of partial product adders were modeled: SL2_ADDER, SL3_ADDER,
SL4_ADDER, and SL5_ADDER. Each of these adders requires the same number of
ADDER_1 components, 16, but their interface signals are different. SL2_ADDER
accepts partial products from the MULT_BLOCK and sums them. The result is a partial
product which is 18 digits long, 0 to 17. The SL3_ADDER then adds two of these results
and outputs a partial product 20 digits long, 0 to 19. SL4_ADDER adds two of these
results and outputs a partial product 24 digits long, 0 to 23. Finally, SL5_ADDER adds
the two partial products from SL4_ADDER and outputs the final partial product, which is
32 digits long, 0 to 31. The entity descriptions for the partial product adders define their
signals.

use work.SD_DEFINITIONS.all;
entity SL2_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( PARTIAL_H : in  PARTIAL_P ( 0 to 16 );
         PARTIAL_L : in  PARTIAL_P ( 0 to 16 );
         P_out     : out PARTIAL_P ( 0 to 17 ) );
end SL2_ADDER;

use work.SD_DEFINITIONS.all;
entity SL3_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( PARTIAL_H : in  PARTIAL_P ( 0 to 17 );
         PARTIAL_L : in  PARTIAL_P ( 0 to 17 );
         P_out     : out PARTIAL_P ( 0 to 19 ) );
end SL3_ADDER;

use work.SD_DEFINITIONS.all;
entity SL4_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( PARTIAL_H : in  PARTIAL_P ( 0 to 19 );
         PARTIAL_L : in  PARTIAL_P ( 0 to 19 );
         P_out     : out PARTIAL_P ( 0 to 23 ) );
end SL4_ADDER;

use work.SD_DEFINITIONS.all;
entity SL5_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( PARTIAL_H : in  PARTIAL_P ( 0 to 23 );
         PARTIAL_L : in  PARTIAL_P ( 0 to 23 );
         P_out     : out PARTIAL_P ( 0 to 31 ) );
end SL5_ADDER;

Complete  VHDL  descriptions  for  the  partial  product  adders  are  given  in  Appendix  C. 

From these components, an SD multiplier which multiplies the mantissas of two SD
numbers, corresponding to a precision greater than IEEE double precision, can be built.
The mantissa multiplier, SD_MULT, accepts two SD numbers of type SD_NUMBER and
outputs a result which is of type PARTIAL_P with a range 0 to 31. The entity description
of SD_MULT defines the multiplier's signals.

use work.SD_DEFINITIONS.all;
entity SD_MULT is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( SD_A   : in  SD_NUMBER;
         SD_B   : in  SD_NUMBER;
         SD_out : out PARTIAL_P ( 0 to 31 ) );
end SD_MULT;

The result, SD_out, is shifted to the right one digit due to the multiply algorithm
discussed in Chapter 4. The architectural description of the multiplier is structural and
instantiates the components MULT_BLOCK, SL2_ADDER, SL3_ADDER, SL4_ADDER,
and SL5_ADDER. MULT_BLOCK is instantiated 16 times, SL2_ADDER 8 times,
SL3_ADDER 4 times, SL4_ADDER 2 times, and SL5_ADDER only once.
The generic parameter is passed down
through each instantiation. The complete VHDL description of SD_MULT is given in
Appendix C.
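
The instantiation counts and digit widths quoted for the adder tree follow a simple pattern, which the Python sketch below reproduces. It rests on an inference from the quoted widths rather than on the VHDL in Appendix C: the two operands of each level are assumed to be offset from one another by 1, 2, 4, and 8 digits, so each level widens its result by that offset. The function name `adder_tree` is hypothetical.

```python
def adder_tree(blocks=16, block_width=17):
    """List (entity name, instance count, result width in digits) per tree level."""
    rows = []
    count = blocks
    width = block_width
    offset = 1                       # assumed digit offset between the two operands
    level = 2                        # the first adder level is named SL2_ADDER
    while count > 1:
        count //= 2                  # each level halves the number of operands
        width += offset              # the result spans both offset operands
        rows.append((f"SL{level}_ADDER", count, width))
        offset *= 2
        level += 1
    return rows


if __name__ == "__main__":
    for name, count, width in adder_tree():
        print(f"{name}: {count} instance(s), result 0 to {width - 1}")
```

With 16 multiplier blocks of 17 digits each, this model yields four levels whose widths match the port ranges declared in the entity descriptions above.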

Testing  of  the  Signed-Digit  Multiplier 

Testing of the multiplier consists of writing a test bench which instantiates the
multiplier and maps test vectors to its inputs. Then, the result is analyzed after the report
is generated. The instantiation of the multiplier is a single instantiation of SD_MULT.
However, generating a set of test vectors is complex. This is due to the
requirements of the digit set of an SD digit. To work around this problem, a test bench package,
TB_PACKAGE, was developed. Within the package, two functions are used to ease the
generation of test vectors and result analysis. The function SD_MAKE is passed a real
number and returns a normalized SD number, while the function SD_TO_REAL is passed an
SD number and returns its real number equivalent. Care must be used when calling these
functions. When SD_MAKE is called and passed a number which is not in the range of
a normalized SD number, the result returned will not have the same value as that passed
but will differ from the argument by some power of 16. The function SD_TO_REAL assumes that
the most significant digit is weighted with a 1. When it is passed the 16 most significant
digits of the multiplier's result, this is not true. Therefore, the value returned is a factor
of 16 smaller than the actual result. However, by passing the function P_out(1 to 16), the
value returned is correct. The test bench SD_MULT_TB is given in Appendix C.
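
The factor-of-16 pitfall is easy to reproduce outside the simulator. The Python sketch below is a stand-in for SD_TO_REAL, written from the behavior described above rather than from the TB_PACKAGE source; the function name `sd_to_real` and the example product are illustrative assumptions. Digits are plain signed integers, most significant first.

```python
def sd_to_real(digits):
    """Value of a radix-16 SD digit list whose first digit has weight 1."""
    return sum(d * 16.0 ** -i for i, d in enumerate(digits))


# Product of 1.5 x 1.5 = 2.25 as it would appear on a 32-digit result bus
# that is shifted right by one digit: index 1 carries the weight 1.
p_out = [0, 2, 4] + [0] * 29

if __name__ == "__main__":
    # VHDL P_out(0 to 15) corresponds to the Python slice p_out[0:16].
    print(sd_to_real(p_out[0:16]))   # a factor of 16 too small
    # VHDL P_out(1 to 16) corresponds to the Python slice p_out[1:17].
    print(sd_to_real(p_out[1:17]))   # the intended value
```

Passing the slice that starts at digit 1 restores the weighting that the conversion function assumes.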

Once the test bench was analyzed, model generated, and built, the model was
simulated. Various reports were generated from the simulation. The correctness of the test
bench package functions was analyzed first. Once the correctness of the functions was verified,
the propagation delay of the multiplier was analyzed. These propagation delays assume
that the inputs have already been converted to SD form and that the mantissa section of
the multiplier requires more time than the addition of the exponents, a reasonable
assumption. When using the default TECHNOLOGY_SCALE, indicating 2 micron technology,
the worst-case propagation delay is 65 ns. If the technology is changed to 1.25 micron,
the TECHNOLOGY_SCALE factor is changed to roughly approximate the speed-up
associated with the change in technology. Linear scaling gives an approximate speed-up of
2, implying that TECHNOLOGY_SCALE equals 0.5. Using this scaling factor, the
worst-case propagation delay is 32 ns. The report generators and the reports are given in
Appendix C. One note regarding the reports generated is that negative real numbers are
not reported correctly. This is a problem with the VHDL report
generator, Intermetrics Version 1.5 running on the Suns.


VII. Conclusions and Recommendations


Conclusions 

The original motivation behind the study into developing a processor to compute
transcendental functions was the requirement to solve the Vector Wave
Equation. Mickey Bailey [1] expanded the set of transcendental functions to encompass a
greater number of functions than required. These functions were all derived from
Chebyshev Polynomials. With the development of the division algorithm, together with the
expanded trigonometric, exponential, and natural logarithm functions to give IEEE
double precision accuracy, an extensive Transcendental Function Processor can be developed.

Chapter 2 and Chapter 3 developed the approximation algorithms and the rationale for
their use. The fewest number of terms needed to achieve an error below a specified value was used
as the determining factor in the selection of the best approximation method. This section
of the thesis covered important information which did not appear in the original effort. The
structure of the approximation algorithms is based on Horner's method of restructuring
a polynomial such that its computational form is suitable for a pipelined processor. The
pre-processing, pipeline processing, and post-processing requirements of a unified processor
were discussed. However, the structure of a unified Transcendental Function Processor did
not evolve. The reason is that the pre-processor must perform different operations
on the arguments of different functions. Therefore, the pre-processor can be
optimized by knowing the mix of the functions requested. The more complex the mix of
the requests, the more complex the control section of the pre-processor must be.
Post-processing has the same complexity problem; if a complex control section for the
post-processor is designed, the throughput of the processor can remain high. However, if the
control section is simple, the processor will need dummy stages inserted into the
post-processing stages to synchronize data for return to memory or further processing. The
pipeline processing section is the best developed section. The pipeline consists of a data
pipeline, an argument pipeline, and a control pipeline. This permits rapid reconfiguration
of the pipeline to compute the approximation functions in any order, without delays in the
arguments into the pipeline.

Chapter 4 presented an overview of alternate forms of data representation for use
in the processor. The most interesting and advanced form is Signed-Digit representation.
SD representation offers a number of advantages when compared to standard binary
representation. The greatest advantage is the reduction of carry-borrow propagation delays.
This increases the computation speeds possible from adders and multipliers. However,
the advantages of SD representation do have an associated cost: the
penalty of converting IEEE double precision numbers to, and from, SD form. The penalty
of the conversion operation to SD form is minor due to its limited carry propagation. The
assimilation penalty is more severe since there exists the possibility of having a borrow
propagate across the entire mantissa. However, in a pipelined processor environment, these
conversions need only occur once.

Chapter 5 expands on the hardware required for numbers represented in SD form.
The basic modules were presented, along with their performance estimates obtained from
SPICE models with LAMBDA equal to 1.0 micron. The S1_RECODER does not have
any propagation delay since it consists only of routing the input bits to their proper
outputs. The S1_ADDER's T output has a propagation delay of 4.9 ns while the X output
requires 6.1 ns. The S2_ADDER requires 4.9 ns to propagate the input to the output. The
M0_MULT's propagation delay is 10 ns for the W output and 13.3 ns for the U output.
Each of the modules was built in VLSI and is presented in Appendix B.

In Chapter 6, the basic modules were described in VHDL and each was simulated to ensure
that its function and propagation times agree with the times obtained from the SPICE
simulation. Then, a 16 digit by 16 digit multiplier was constructed and simulated. Simulation
estimates the worst-case propagation delay of the SD mantissa multiplier as 65 ns when
using 2.0 micron technology, excluding conversion and assimilation time. This propagation
time drops to 32 ns when the technology is changed to 1.25 micron. The additional time
required for only the conversion of the mantissa is the propagation time of the S2 Adder,
4.9 ns. Assimilation of the mantissa is dependent on the construction of the Assimilation
Unit. The simulation results, as well as the VHDL descriptions of the hardware, are
presented in Appendix C. The speed of the SD hardware is comparable to a
step in technology size when compared to standard methods of computation.


Recommendations 


The Transcendental Function Processor requires further investigation into the
trade-offs between control complexity and throughput for its pre- and post-processors. This will
rely heavily on the type and frequency of functions to be computed. However, the
dedication of hardware of any form to the processor is still premature. Further work is required
into the realizable advantages of SD representation. A tiny chip was constructed as part of
this thesis effort and is shown in Appendix B. This chip needs to be fabricated and tested,
with results compared to those expected from a VHDL model. If the results show that SD
representation does provide an appreciable speed-up, then a full SD multiplier should be
built and tested. Though this thesis did not consider the size requirement of SD hardware,
this must be studied when considering its use in the Transcendental Function Processor.
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Appendix  A.  Determination  of  Chebyshev  Constants 
The evaluation of the integral

\[ a_n = \frac{2}{\pi} \int_0^{\pi} f(\cos x) \cos nx \, dx \]

is not simple for most functions, $f(x)$. However, the accuracy of the summed Chebyshev
Polynomials is dependent on the accuracy of these constants. To obtain a resultant
accuracy of double precision, the precision of these constants is required to be greater than
double precision. Therefore, for those functions for which the integral can be evaluated, the
accuracy of the result can easily be reached. For functions where the integral cannot be
evaluated directly, the result must be approximated by using an integral approximation
method such as Simpson's Rule. Using these types of approximation methods requires a
great deal of care. The limiting factor in making these approximations is the precision of
the computer used. If the computer only has the ability to compute up to double precision
accuracy, then the resultant error will be somewhat greater, depending on the
distribution of the truncation errors in the computation. For all of the coefficients used in the
Transcendental  Function  Processor,  the  error  term  of  the  coefficients  is  required  to  be  less 
than  2“^.  This  is  due  to  the  internal  precision  ability  of  the  processor  when  numbers  are 
represented  in  Signed-Digit  form. 

Additional problems appear when trying to approximate the coefficients to the required
accuracy. The shape of the graph of the integrand must be considered. If the
integrand has the shape of a negative parabola, then the approximation must begin with the
outer edges, where the magnitude is the smallest, and sum towards the center. The opposite
is true if the shape is similar to a positive parabola. Virtually all of the transcendental
functions of interest exhibit one of these shapes. The important point to remember is that the
smallest magnitudes of the curve must be summed first. Also, when trying to approximate
using a method such as Simpson's Rule, the number of
intervals that must be summed to obtain the required precision is quite large. However, if the programs are written
carefully and the library routines validated for accuracy, a method which computes the
area under the curves by summing intervals can be used.
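
The ordering rule can be illustrated with a short numerical experiment. The Python sketch below approximates a Chebyshev coefficient with Simpson's Rule and adds the weighted samples in order of increasing magnitude. The function names `simpson_terms` and `chebyshev_coefficient`, the interval count, and the choice of $f(x) = \sin(\pi x/2)$ as the test function are assumptions made for illustration, and the arithmetic is ordinary double precision, so it does not reach the extended precision that the processor coefficients require.

```python
import math


def simpson_terms(g, a, b, intervals):
    """Return the individual weighted samples of a composite Simpson sum."""
    if intervals % 2:
        raise ValueError("Simpson's Rule needs an even number of intervals")
    h = (b - a) / intervals
    terms = []
    for i in range(intervals + 1):
        if i in (0, intervals):
            weight = 1
        else:
            weight = 4 if i % 2 else 2
        terms.append(weight * g(a + i * h) * h / 3.0)
    return terms


def chebyshev_coefficient(f, n, intervals=2000):
    """Approximate a_n = (2/pi) * integral over [0, pi] of f(cos x) cos(nx)."""
    def integrand(x):
        return f(math.cos(x)) * math.cos(n * x)

    terms = simpson_terms(integrand, 0.0, math.pi, intervals)
    terms.sort(key=abs)              # sum the smallest magnitudes first
    return (2.0 / math.pi) * sum(terms)


if __name__ == "__main__":
    def f(x):
        return math.sin(math.pi * x / 2)

    for n in (1, 3, 5):
        print(n, chebyshev_coefficient(f, n))
```

Sorting by magnitude is a general-purpose way of honoring the rule; for an integrand with a known parabolic shape, the same effect is obtained by choosing the direction of the summation loop.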
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As  stated  previously,  there  are  ways  to  solve  for  the  integration.  One  such  method 
involves  Residue  Analysis.  As  an  example  of  how  this  analysis  works,  the  coefficients  for 
$f(x) = \sin(\pi x/2)$ will be solved. Therefore, the equation for the coefficients is

\[ a_n = \frac{2}{\pi} \int_0^{\pi} \sin\left(\frac{\pi \cos x}{2}\right) \cos nx \, dx \]

The limits of integration are changed by recognizing that the integrand is an even function of $x$.
The result is a circular interval of integration.

\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} \sin\left(\frac{\pi \cos x}{2}\right) \cos nx \, dx \]


The  first  step  in  Residue  Analysis  is  to  generate  a  series  in  the  complex  plane  to  represent 
the  integrand.  Euler’s  Equation  is  used  to  do  this  conversion. 


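\[ \cos x = \frac{e^{ix} + e^{-ix}}{2} \]

and

\[ \cos nx = \frac{e^{inx} + e^{-inx}}{2} . \]

If

\[ e^{ix} = Z \]

then

\[ i e^{ix} \, dx = dZ . \]

Rearranging,

\[ dx = \frac{dZ}{iZ} . \]

Therefore, the integral is

\[ a_n = \frac{1}{\pi} \oint_C \sin\left(\frac{\pi}{4}\left(Z + Z^{-1}\right)\right) \left(\frac{Z^{n} + Z^{-n}}{2}\right) \frac{dZ}{iZ} \]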

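where C is the unit circle centered at the origin, traversed in the counterclockwise
direction. To perform simple Residue Analysis, there should only be one unique pole in
the unit radius around zero, which is the case here. Therefore,

\[ a_n = \operatorname{Res}_{Z=0} \left\{ \sin\left(\frac{\pi}{4}\left(Z + Z^{-1}\right)\right) \left(Z^{n-1} + Z^{-n-1}\right) \right\} \]

To derive a series from the above equation, the trigonometric series for Sine is used.

\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!} \]

Solving in steps,

\[ \sin\left(\frac{\pi}{4}\left(Z + Z^{-1}\right)\right) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!} \left(\frac{\pi}{4}\right)^{2k+1} \left(Z + Z^{-1}\right)^{2k+1} \]

And,

\[ \left(Z + Z^{-1}\right)^{2k+1} = Z^{2k+1} + (2k+1) Z^{2k-1} + \frac{(2k)(2k+1)}{2!} Z^{2k-3} + \cdots + Z^{-2k-1} \]

or

\[ \left(Z + Z^{-1}\right)^{2k+1} = \sum_{j=0}^{2k+1} \left( \frac{(2k+1)!}{j! \, (2k+1-j)!} \right) Z^{2j-2k-1} \]

Therefore,

\[ \sin\left(\frac{\pi}{4}\left(Z + Z^{-1}\right)\right) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!} \left(\frac{\pi}{4}\right)^{2k+1} \sum_{j=0}^{2k+1} \left( \frac{(2k+1)!}{j! \, (2k+1-j)!} \right) Z^{2j-2k-1} \]

The coefficients equal

\[ a_n = \operatorname{Res}_{Z=0} \left\{ \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!} \left(\frac{\pi}{4}\right)^{2k+1} \sum_{j=0}^{2k+1} \left( \frac{(2k+1)!}{j! \, (2k+1-j)!} \right) \left( Z^{2j-2k+n-2} + Z^{2j-2k-n-2} \right) \right\} \]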


In Residue Analysis, when looking for the first integration of a series whose pole is at Z
= 0, the integration value is obtained from the coefficient of $Z^{-1}$. Therefore, from the
equation above, the values of j which give a power of -1 to Z must be found.

\[ 2j - 2k + n - 2 = -1 \]

and

\[ 2j - 2k - n - 2 = -1 . \]

Therefore, from the first equation,

\[ j = k - \frac{n-1}{2} \]

and from the second equation,

\[ j = k + \frac{n+1}{2} \]

Using these values for j and solving,

\[ a_n = 2 \sum_{k=(n-1)/2}^{\infty} (-1)^k \left(\frac{\pi}{4}\right)^{2k+1} \frac{1}{\left(k - \frac{n-1}{2}\right)! \, \left(k + \frac{n+1}{2}\right)!} \]


This infinite series is evaluated by summing a finite number of terms. Since the denominator of
the series is a factorial, the number of terms required to be summed to obtain the needed
precision is small, on the order of 30 terms. To maintain precision, the summation must
occur in reverse order; that is, the sum should be computed from k = 30 down to 0 when
computing $a_1$.
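
The reverse-order summation can be sketched as follows. The Python fragment below evaluates the series form of the coefficients for $f(x) = \sin(\pi x/2)$, namely $a_n = 2 \sum_k (-1)^k (\pi/4)^{2k+1} / [(k-(n-1)/2)! \, (k+(n+1)/2)!]$ for odd n, which is what the residue argument above leads to. The function name `chebyshev_sine_coefficient` and the default of 30 terms past the first non-zero one are illustrative choices, and the result is limited to double precision rather than the extended precision the processor needs.

```python
import math


def chebyshev_sine_coefficient(n, terms=30):
    """Series value of a_n for f(x) = sin(pi*x/2); n must be odd and positive."""
    if n < 1 or n % 2 == 0:
        raise ValueError("only odd n >= 1 give non-zero coefficients")
    m = (n - 1) // 2                       # first k with a non-zero term
    total = 0.0
    # Sum from the largest k downward so the smallest terms are added first.
    for k in range(m + terms, m - 1, -1):
        magnitude = (math.pi / 4) ** (2 * k + 1) / (
            math.factorial(k - m) * math.factorial(k + m + 1)
        )
        total += -magnitude if k % 2 else magnitude
    return 2.0 * total


if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(n, chebyshev_sine_coefficient(n))
```

A convenient sanity check is that the coefficients must sum to $\sin(\pi/2) = 1$, since every Chebyshev polynomial equals 1 at x = 1.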
Appendix B. Signed-Digit CIFPLOTS

Figure B.1. CIFPLOT of S1A Adder.

Figure B.2. CIFPLOT of S2 Adder.

Figure B.3. CIFPLOT of M0 Multiplier.
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Appendix  C.  Signed-Digit  VHDL  Descriptions 


package SD_DEFINITIONS is

  subtype SD_DIGIT is bit_vector( 4 downto 0 );
  type SD_NUMBER is array ( 0 to 15 ) of SD_DIGIT;
  type PARTIAL_P is array ( integer range <> ) of SD_DIGIT;

  subtype X_TYPE is bit_vector( 1 downto 0 );
  subtype T_TYPE is bit_vector( 4 downto 0 );
  subtype U_TYPE is bit_vector( 3 downto 0 );
  subtype W_TYPE is bit_vector( 4 downto 0 );

  type T_ARRAY is array ( integer range <> ) of T_TYPE;
  type X_ARRAY is array ( integer range <> ) of X_TYPE;
  type U_ARRAY is array ( integer range <> ) of U_TYPE;
  type W_ARRAY is array ( integer range <> ) of W_TYPE;

  function U_TO_SD ( U_value : U_TYPE ) return SD_DIGIT;
  function U_TO_T ( U_value : U_TYPE ) return T_TYPE;
  function BIN_TO_INT ( IN_VECT : bit_vector ) return INTEGER;
  function INT_TO_SD ( INT_VAL : integer ) return SD_DIGIT;

end SD_DEFINITIONS;

package body SD_DEFINITIONS is

    -- Convert a two's-complement bit_vector (index 0 = LSB) to an integer.
    function BIN_TO_INT ( IN_VECT : bit_vector ) return INTEGER is
        variable vect_high, int_val, scale : integer;
    begin
        int_val := 0;
        scale := 1;

        for i in 0 to ( IN_VECT'high - 1 ) loop
            if ( IN_VECT(i) = '1' ) then
                int_val := int_val + scale;
            end if;
            scale := scale*2;
        end loop;

        vect_high := IN_VECT'high;

        -- The most significant bit carries a negative weight.
        if ( IN_VECT(vect_high) = '1' ) then
            int_val := int_val - scale;
        end if;

        return ( int_val );
    end BIN_TO_INT;


    -- Convert an integer in the range -16 to 15 to a 5-bit signed digit.
    function INT_TO_SD ( INT_VAL : integer ) return SD_DIGIT is
        variable int_vect : SD_DIGIT;
        variable range_ck, temp : integer;
    begin
        if ( INT_VAL < 0 ) then
            int_vect(4) := '1';
            temp := 16 + INT_VAL;
        else
            int_vect(4) := '0';
            temp := INT_VAL;
        end if;

        range_ck := 8;

        for i in 3 downto 0 loop
            if ( temp >= range_ck ) then
                int_vect(i) := '1';
                temp := temp - range_ck;
            else
                int_vect(i) := '0';
            end if;
            range_ck := range_ck/2;
        end loop;

        return ( int_vect );
    end INT_TO_SD;


    -- Sign-extend a 4-bit U value to a 5-bit signed digit.
    function U_TO_SD ( U_value : U_TYPE ) return SD_DIGIT is
        variable SD_value : SD_DIGIT;
    begin
        SD_value(0) := U_value(0);
        SD_value(1) := U_value(1);
        SD_value(2) := U_value(2);
        SD_value(3) := U_value(3);
        SD_value(4) := U_value(3);

        return ( SD_value );
    end U_TO_SD;


    -- Sign-extend a 4-bit U value to a 5-bit T value.
    function U_TO_T ( U_value : U_TYPE ) return T_TYPE is
        variable T_value : T_TYPE;
    begin
        for I in 0 to 3 loop
            T_value(I) := U_value(I);
        end loop;

        T_value(4) := U_value(3);
        return ( T_value );
    end U_TO_T;

end SD_DEFINITIONS;
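As a cross-check outside a VHDL simulator, the two conversion functions of the package can be mirrored in Python. The sketch below is an assumption-laden model rather than the thesis code: bit vectors are represented as strings written most significant bit first (as they appear in the simulation reports), and the names `bin_to_int` and `int_to_sd` are hypothetical counterparts of BIN_TO_INT and INT_TO_SD.

```python
def bin_to_int(bits):
    """Two's-complement string (MSB first, e.g. "11011") to integer."""
    value = 0
    scale = 1
    for ch in reversed(bits[1:]):   # magnitude bits, least significant first
        if ch == "1":
            value += scale
        scale *= 2
    if bits[0] == "1":              # the sign bit has weight -2**(n-1)
        value -= scale
    return value


def int_to_sd(value):
    """Integer in -16..15 to a 5-bit two's-complement string (MSB first)."""
    if not -16 <= value <= 15:
        raise ValueError("value does not fit in a 5-bit signed digit")
    sign = "1" if value < 0 else "0"
    temp = value + 16 if value < 0 else value
    out = sign
    range_ck = 8
    for _ in range(4):              # bits 3 down to 0
        if temp >= range_ck:
            out += "1"
            temp -= range_ck
        else:
            out += "0"
        range_ck //= 2
    return out


if __name__ == "__main__":
    print(int_to_sd(-5), bin_to_int("11011"))
```

A round trip over the whole 5-bit range is a simple sanity test for both directions.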
use work.SD_DEFINITIONS.all;
entity S1_RECODER is
    port ( DATA_IN : in bit_vector ( 3 downto 0 );
           X_out   : out X_TYPE;
           T_out   : out T_TYPE);
end S1_RECODER;


use work.SD_DEFINITIONS.all;
architecture Structural of S1_RECODER is
begin
    -- The interim digit is the sign-extended 4-bit slice.
    T_out(0) <= DATA_IN(0);
    T_out(1) <= DATA_IN(1);
    T_out(2) <= DATA_IN(2);
    T_out(3) <= DATA_IN(3);
    T_out(4) <= DATA_IN(3);

    -- A transfer of +1 is generated when the slice is recoded as negative.
    X_out(0) <= DATA_IN(3);
    X_out(1) <= '0';
end Structural;
use work.SD_DEFINITIONS.all;
entity S1_ADDER is
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( SD1_in  : in SD_DIGIT;
           SD2_in  : in SD_DIGIT;
           ADD_SUB : in bit;
           X_out   : out X_TYPE;
           T_out   : out T_TYPE);
end S1_ADDER;


use work.SD_DEFINITIONS.all;
architecture Behavioral of S1_ADDER is
begin
    process
        variable SD1_val, SD2_val, SUM : integer;
        variable X_temp : bit_vector ( 1 downto 0 );
    begin
        wait on SD1_in, SD2_in, ADD_SUB;

        SD1_val := BIN_TO_INT( SD1_in );
        SD2_val := BIN_TO_INT( SD2_in );

        if ( ADD_SUB = '0' ) then
            SUM := SD1_val + SD2_val;
        else
            SUM := SD1_val + SD2_val + 1;
        end if;

        -- Bring the interim sum back into range and emit a transfer digit.
        if ( SUM >= 8 ) then
            SUM := SUM - 16;
            X_temp(0) := '1';
            X_temp(1) := '0';
        elsif ( SUM <= -8 ) then
            SUM := SUM + 16;
            X_temp(0) := '1';
            X_temp(1) := '1';
        else
            X_temp(0) := '0';
            X_temp(1) := '0';
        end if;

        X_out <= X_temp after ( TECHNOLOGY_SCALE * 6.1 ns);
        T_out <= INT_TO_SD( SUM ) after ( TECHNOLOGY_SCALE * 4.9 ns);
    end process;
end Behavioral;
use work.SD_DEFINITIONS.all;
entity S2_ADDER is
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( X_in   : in X_TYPE;
           T_in   : in T_TYPE;
           SD_out : out SD_DIGIT);
end S2_ADDER;


use work.SD_DEFINITIONS.all;
architecture Behavioral of S2_ADDER is
begin
    process
        variable T_VAL, X_VAL, SUM : integer;
    begin
        wait on X_in, T_in;

        T_VAL := BIN_TO_INT( T_in );
        X_VAL := BIN_TO_INT( X_in );

        -- Final digit: interim sum plus the incoming transfer digit.
        SUM := T_VAL + X_VAL;

        SD_out <= INT_TO_SD( SUM ) after ( TECHNOLOGY_SCALE * 4.9 ns);
    end process;
end Behavioral;
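The arithmetic performed by the two adder stages can be modelled on plain integers, ignoring the signal delays. The following Python sketch is a hypothetical reference model (the names `s1_adder` and `s2_adder` are chosen here for illustration); the values in the usage lines correspond to operand pairs that appear in the adder test bench stimulus later in this appendix.

```python
def s1_adder(sd1, sd2, add_sub=0):
    """First stage: return (transfer digit x, interim sum t) with 16*x + t == sum."""
    total = sd1 + sd2 + (1 if add_sub else 0)
    if total >= 8:
        return 1, total - 16
    if total <= -8:
        return -1, total + 16
    return 0, total


def s2_adder(x, t):
    """Second stage: interim sum plus the transfer digit from the neighbouring stage."""
    return t + x


if __name__ == "__main__":
    # 8 + 3 = 11 is out of range, so a transfer of +1 and an interim sum of -5 result.
    print(s1_adder(8, 3))
    # -3 + -8 = -11 gives a transfer of -1 and an interim sum of 5.
    print(s1_adder(-3, -8))
    print(s2_adder(-1, 5))
```

Because the transfer digit is decided from the two operand digits alone, no carry has to ripple through more than one digit position, which is the property the S1/S2 split exploits.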
use work.SD_DEFINITIONS.all;
entity M0_MULT is
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( A_DIGIT : in SD_DIGIT;
           B_DIGIT : in SD_DIGIT;
           W_OUT   : out W_TYPE;
           U_OUT   : out U_TYPE);
end M0_MULT;

use work.SD_DEFINITIONS.all;
architecture Behavioral of M0_MULT is
begin
    process
        variable A_val, B_val, PROD, U_val : integer;
        variable long_U : bit_vector ( 4 downto 0 );
    begin
        wait on A_DIGIT, B_DIGIT;

        A_val := BIN_TO_INT( A_DIGIT );
        B_val := BIN_TO_INT( B_DIGIT );

        PROD := A_val*B_val;
        U_val := 0;

        -- Split the product into 16*U_val + PROD with PROD in range.
        if ( PROD >= 0 ) then
            for i in 1 to 6 loop
                if ( PROD >= 8 ) then
                    PROD := PROD - 16;
                    U_val := U_val + 1;
                end if;
            end loop;
        else
            for i in 1 to 6 loop
                if ( PROD <= -8 ) then
                    PROD := PROD + 16;
                    U_val := U_val - 1;
                end if;
            end loop;
        end if;

        long_U := INT_TO_SD( U_val );

        U_OUT(0) <= long_U(0) after ( TECHNOLOGY_SCALE * 13.3 ns);
        U_OUT(1) <= long_U(1) after ( TECHNOLOGY_SCALE * 13.3 ns);
        U_OUT(2) <= long_U(2) after ( TECHNOLOGY_SCALE * 13.3 ns);
        U_OUT(3) <= long_U(3) after ( TECHNOLOGY_SCALE * 13.3 ns);

        W_OUT <= INT_TO_SD( PROD ) after ( TECHNOLOGY_SCALE * 9.6 ns);
    end process;
end Behavioral;
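The digit multiplier splits a product into an upper digit U and a lower digit W such that 16*U + W equals the product. A small Python model, given here only as an illustrative sketch with the hypothetical name `m0_mult`, makes it easy to check that decomposition against operand pairs used in the multiplier test bench (for example 10 x 1 and 10 x 10).

```python
def m0_mult(a, b):
    """Return (u, w) with 16*u + w == a*b, mirroring the two loops of M0_MULT."""
    prod = a * b
    u = 0
    if prod >= 0:
        for _ in range(6):
            if prod >= 8:
                prod -= 16
                u += 1
    else:
        for _ in range(6):
            if prod <= -8:
                prod += 16
                u -= 1
    return u, prod


if __name__ == "__main__":
    for a, b in ((10, 1), (-10, 1), (10, 10), (-10, 10)):
        print(a, b, m0_mult(a, b))
```

Six loop iterations are enough for operand digits between -10 and 10, since the largest product magnitude is 100 and six steps of 16 cover 96.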
use work.SD_DEFINITIONS.all;
entity CONVERSION_TB is
end CONVERSION_TB;


use work.SD_DEFINITIONS.all;
architecture TEST_CO of CONVERSION_TB is

    component S1_RECODER
        port ( DATA_IN : in bit_vector ( 3 downto 0 );
               X_out   : out X_TYPE;
               T_out   : out T_TYPE);
    end component;

    component S2_ADDER
        generic ( TECHNOLOGY_SCALE : real := 1.0 );
        port ( X_in   : in X_TYPE;
               T_in   : in T_TYPE;
               SD_out : out SD_DIGIT);
    end component;

    for all : S1_RECODER use entity work.S1_RECODER(Structural);
    for all : S2_ADDER use entity work.S2_ADDER(Behavioral);

    signal SLICE0, SLICE1 : bit_vector ( 3 downto 0 );
    signal X_1, X0 : X_TYPE;
    signal T0 : T_TYPE;
    signal SD0, SD1 : SD_DIGIT;

begin

    S1R : S1_RECODER
        port map ( DATA_IN => SLICE0,
                   X_out   => X_1,
                   T_out   => T0);

    S2R : S1_RECODER
        port map ( DATA_IN => SLICE1,
                   X_out   => X0,
                   T_out   => SD1);

    S2A : S2_ADDER
        port map ( T_in   => T0,
                   X_in   => X0,
                   SD_out => SD0);

    SLICE1 <= "0001" after 20 ns,  "0010" after 40 ns,
              "0100" after 60 ns,  "0110" after 80 ns,
              "1000" after 100 ns, "1010" after 120 ns,
              "1100" after 140 ns, "1110" after 160 ns,
              "1111" after 180 ns, "1110" after 220 ns,
              "1100" after 240 ns, "1010" after 260 ns,
              "1000" after 280 ns, "0110" after 300 ns,
              "0100" after 320 ns, "0010" after 340 ns,
              "0001" after 360 ns, "0000" after 380 ns;

    SLICE0 <= "0001" after 200 ns;

end TEST_CO;
"SD Conversion module report"


Vhdl Simulation Report

Report Name: "SD Conversion module report"
Kernel Library Name: <<RPETERSO>>TEST_CO
Kernel Creation Date: MAR-31-1989
Kernel Creation Time: 15:37:49
Run Identifer: 1
Run Date: MAR-31-1989
Run Time: 15:37:49

Report Control Language File: conversion_report.rcl
Report Output File : conversion_report.rpt

Max Time: 9223372036854775807
Max Delta: 2147483646

Report Control Language :

Simulation_report CONVERSION_report is
begin
    Report_name is "SD Conversion module report";
    Page_width is 80;
    Page_length is 50;
    Signal_format is horizontal;
    Sample_signals by_event in ns;
    Select_signal : SLICE0;
    Select_signal : SLICE1;
    Select_signal : SD0;
    Select_signal : SD1;
end CONVERSION_report;

Report Format Information :

Time is in NS relative to the start of simulation
Time period for report is from 0 NS to End of Simulation
Signal values are reported by event ( ' ' indicates no event )
MAR-31-1989 15:41:14          VHDL Report Generator          PAGE 2
"SD Conversion module report"

TIME  |------------------------SIGNAL NAMES------------------------|
(NS)  | SLICE0       | SLICE1       | SD0          | SD1           |
      | (3 DOWNTO 0) | (3 DOWNTO 0) | (4 DOWNTO 0) | (4 DOWNTO 0)  |
0     | "0000"       | "0000"       | "00000"      | "00000"       |
20    |              | "0001"       |              |               |
+1    |              |              |              | "00001"       |
40    |              | "0010"       |              |               |
+1    |              |              |              | "00010"       |
60    |              | "0100"       |              |               |
+1    |              |              |              | "00100"       |
80    |              | "0110"       |              |               |
+1    |              |              |              | "00110"       |
100   |              | "1000"       |              |               |
+1    |              |              |              | "11000"       |
104*  |              |              | "00001"      |               |
120   |              | "1010"       |              |               |
+1    |              |              |              | "11010"       |
140   |              | "1100"       |              |               |
+1    |              |              |              | "11100"       |
160   |              | "1110"       |              |               |
+1    |              |              |              | "11110"       |
180   |              | "1111"       |              |               |
+1    |              |              |              | "11111"       |
200   | "0001"       |              |              |               |
204*  |              |              | "00010"      |               |
220   |              | "1110"       |              |               |
+1    |              |              |              | "11110"       |
240   |              | "1100"       |              |               |
+1    |              |              |              | "11100"       |
260   |              | "1010"       |              |               |
+1    |              |              |              | "11010"       |
280   |              | "1000"       |              |               |
+1    |              |              |              | "11000"       |
300   |              | "0110"       |              |               |
+1    |              |              |              | "00110"       |
304*  |              |              | "00001"      |               |
320   |              | "0100"       |              |               |
+1    |              |              |              | "00100"       |
340   |              | "0010"       |              |               |
+1    |              |              |              | "00010"       |
use work.SD_DEFINITIONS.all;
entity ADDER_TB is
end ADDER_TB;


use work.SD_DEFINITIONS.all;
architecture TEST_ADDER of ADDER_TB is

    component S1_ADDER
        generic ( TECHNOLOGY_SCALE : real := 1.0 );
        port ( SD1_in  : in SD_DIGIT;
               SD2_in  : in SD_DIGIT;
               ADD_SUB : in bit;
               X_out   : out X_TYPE;
               T_out   : out T_TYPE);
    end component;

    component S2_ADDER
        generic ( TECHNOLOGY_SCALE : real := 1.0 );
        port ( X_in   : in X_TYPE;
               T_in   : in T_TYPE;
               SD_out : out SD_DIGIT);
    end component;

    for all : S1_ADDER use entity work.S1_ADDER(Behavioral);
    for all : S2_ADDER use entity work.S2_ADDER(Behavioral);

    signal SD0, SD1, SD2, SDA, SDB, SD00, SD01, SD02 : SD_DIGIT;
    signal X0, X1 : X_TYPE;
    signal T1 : T_TYPE;
    signal ADD_CNTL : bit;

begin

    S1A : S1_ADDER
        port map ( SD1_in  => SD1,
                   SD2_in  => SDA,
                   ADD_SUB => ADD_CNTL,
                   X_out   => X0,
                   T_out   => T1);

    S1B : S1_ADDER
        port map ( SD1_in  => SD2,
                   SD2_in  => SDB,
                   ADD_SUB => ADD_CNTL,
                   X_out   => X1,
                   T_out   => SD02);

    S2A : S2_ADDER
        port map ( T_in   => SD0,
                   X_in   => X0,
                   SD_out => SD00);

    S2B : S2_ADDER
        port map ( T_in   => T1,
                   X_in   => X1,
                   SD_out => SD01);

    SD0 <= "00000";
    ADD_CNTL <= '0';

    SD1 <= "00100" after 25 ns,  "01000" after 50 ns,
           "00000" after 75 ns,  "11100" after 100 ns,
           "11000" after 125 ns, "10110" after 150 ns,
           "01010" after 175 ns, "00000" after 200 ns,
           "00100" after 225 ns, "01000" after 250 ns,
           "00000" after 275 ns, "11100" after 300 ns,
           "11000" after 325 ns, "10110" after 350 ns,
           "01010" after 375 ns;

    SDA <= "00011", "11101" after 200 ns;
    SD2 <= SDA;
    SDB <= SD1;

end TEST_ADDER;
MAR-31-1989 15:39:25          VHDL Report Generator          PAGE 1
"SD Adder module report"

Vhdl Simulation Report

Report Name: "SD Adder module report"
Kernel Library Name: <<RPETERSO>>TEST_ADDER
Kernel Creation Date: MAR-31-1989
Kernel Creation Time: 15:38:42
Run Identifer: 1
Run Date: MAR-31-1989
Run Time: 15:38:42

Report Control Language File: adder_report.rcl
Report Output File : adder_report.rpt

Max Time: 9223372036854775807
Max Delta: 2147483646


Report Control Language :

Simulation_report ADDER_report is
begin
    Report_name is "SD Adder module report";
    Page_width is 80;
    Page_length is 50;
    Signal_format is vertical;
    Sample_signals by_event in ns;
    Select_signal : SD1;
    Select_signal : SD2;
    Select_signal : SDA;
    Select_signal : SDB;
    Select_signal : SD00;
    Select_signal : SD01;
    Select_signal : SD02;
end ADDER_report;

Report Format Information :

Time is in NS relative to the start of simulation
Time period for report is from 0 NS to End of Simulation
Signal values are reported by event ( ' ' indicates no event )
TIME  |---------------------------SIGNAL NAMES---------------------------|
(NS)  | SD1     | SD2     | SDA     | SDB     | SD00    | SD01    | SD02    |
      | (each signal is 4 DOWNTO 0)                                       |
0     | "00000" | "00000" | "00000" | "00000" | "00000" | "00000" | "00000" |
+1    |         |         | "00011" |         |         |         |         |
+2    |         | "00011" |         |         |         |         |         |
4*    |         |         |         |         |         |         | "00011" |
9*    |         |         |         |         |         | "00011" |         |
25    | "00100" |         |         |         |         |         |         |
+1    |         |         |         | "00100" |         |         |         |
29*   |         |         |         |         |         |         | "00111" |
34*   |         |         |         |         |         | "00111" |         |
50    | "01000" |         |         |         |         |         |         |
+1    |         |         |         | "01000" |         |         |         |
54*   |         |         |         |         |         |         | "11011" |
59*   |         |         |         |         |         | "11111" |         |
61    |         |         |         |         | "00001" | "11100" |         |
75    | "00000" |         |         |         |         |         |         |
+1    |         |         |         | "00000" |         |         |         |
79*   |         |         |         |         |         |         | "00011" |
84*   |         |         |         |         |         | "00100" |         |
86    |         |         |         |         | "00000" | "00011" |         |
100   | "11100" |         |         |         |         |         |         |
+1    |         |         |         | "11100" |         |         |         |
104*  |         |         |         |         |         |         | "11111" |
109*  |         |         |         |         |         | "11111" |         |
125   | "11000" |         |         |         |         |         |         |
+1    |         |         |         | "11000" |         |         |         |
129*  |         |         |         |         |         |         | "11011" |
134*  |         |         |         |         |         | "11011" |         |
150   | "10110" |         |         |         |         |         |         |
+1    |         |         |         | "10110" |         |         |         |
154*  |         |         |         |         |         |         | "11001" |
159*  |         |         |         |         |         | "11001" |         |
175   | "01010" |         |         |         |         |         |         |
+1    |         |         |         | "01010" |         |         |         |
179*  |         |         |         |         |         |         | "11101" |
184*  |         |         |         |         |         | "11101" |         |
186   |         |         |         |         | "00001" | "11110" |         |
200   | "00000" |         | "11101" |         |         |         |         |
+1    |         | "11101" |         | "00000" |         |         |         |
211   |         |         |         |         | "00000" | "11101" |         |
225   | "00100" |         |         |         |         |         |         |
+1    |         |         |         | "00100" |         |         |         |
229*  |         |         |         |         |         |         | "00001" |
234*  |         |         |         |         |         | "00001" |         |
250   | "01000" |         |         |         |         |         |         |
+1    |         |         |         | "01000" |         |         |         |
254*  |         |         |         |         |         |         | "00101" |
259*  |         |         |         |         |         | "00101" |         |
275   | "00000" |         |         |         |         |         |         |
+1    |         |         |         | "00000" |         |         |         |
279*  |         |         |         |         |         |         | "11101" |
284*  |         |         |         |         |         | "11101" |         |
300   | "11100" |         |         |         |         |         |         |
+1    |         |         |         | "11100" |         |         |         |
304*  |         |         |         |         |         |         | "11001" |
309*  |         |         |         |         |         | "11001" |         |
325   | "11000" |         |         |         |         |         |         |
+1    |         |         |         | "11000" |         |         |         |
329*  |         |         |         |         |         |         | "00101" |
334*  |         |         |         |         |         | "00101" |         |
336   |         |         |         |         | "11111" | "00100" |         |
350   | "10110" |         |         |         |         |         |         |
+1    |         |         |         | "10110" |         |         |         |
354*  |         |         |         |         |         |         | "00011" |
359*  |         |         |         |         |         | "00010" |         |
use work.SD_DEFINITIONS.all;
entity M0_TB is
end M0_TB;


use work.SD_DEFINITIONS.all;
architecture TEST_M0 of M0_TB is

    component M0_MULT
        generic ( TECHNOLOGY_SCALE : real := 1.0 );
        port ( A_DIGIT : in SD_DIGIT;
               B_DIGIT : in SD_DIGIT;
               W_OUT   : out W_TYPE;
               U_OUT   : out U_TYPE);
    end component;

    for all : M0_MULT use entity work.M0_MULT(Behavioral);

    signal A_DIGIT, B_DIGIT : SD_DIGIT;
    signal W_out : W_TYPE;
    signal U_out : U_TYPE;

begin

    M00 : M0_MULT
        port map ( A_DIGIT => A_DIGIT,
                   B_DIGIT => B_DIGIT,
                   W_OUT   => W_out,
                   U_OUT   => U_out);

    A_DIGIT <= "01010" after 50 ns,  "10110" after 100 ns,
               "00000" after 150 ns, "01010" after 200 ns,
               "10110" after 250 ns, "00000" after 300 ns,
               "01010" after 350 ns, "10110" after 400 ns,
               "00000" after 450 ns, "01010" after 500 ns,
               "10110" after 550 ns;

    B_DIGIT <= "00001" after 150 ns, "01010" after 300 ns,
               "10110" after 450 ns;

end TEST_M0;


APR-13-1989 12:26:02          VHDL Report Generator          PAGE 1
"M0 Multiplier module report"

Vhdl Simulation Report

Report Name: "M0 Multiplier module report"
Kernel Library Name: <<PETERSON>>TEST_M0
Kernel Creation Date: APR-12-1989
Kernel Creation Time: 10:48:07
Run Identifer: 1
Run Date: APR-12-1989
Run Time: 10:48:07

Report Control Language File: m0_report.rcl
Report Output File : m0_report.rpt

Max Time: 9223372036854775807
Max Delta: 2147483646

Report Control Language :

Simulation_report M0_report is
begin
    Report_name is "M0 Multiplier module report";
    Page_width is 80;
    Page_length is 50;
    Signal_format is vertical;
    Sample_signals by_event in ns;
    Select_signal : A_DIGIT;
    Select_signal : B_DIGIT;
    Select_signal : U_out;
    Select_signal : W_out;
end M0_report;

Report Format Information :

Time is in NS relative to the start of simulation
Time period for report is from 0 NS to End of Simulation
Signal values are reported by event ( ' ' indicates no event )
TIME  |----------------------SIGNAL NAMES-----------------------|
(NS)  | A_DIGIT      | B_DIGIT      | U_OUT        | W_OUT        |
      | (4 DOWNTO 0) | (4 DOWNTO 0) | (3 DOWNTO 0) | (4 DOWNTO 0) |
0     | "00000"      | "00000"      | "0000"       | "00000"      |
50    | "01010"      |              |              |              |
100   | "10110"      |              |              |              |
150   | "00000"      | "00001"      |              |              |
200   | "01010"      |              |              |              |
209*  |              |              |              | "11010"      |
213*  |              |              | "0001"       |              |
250   | "10110"      |              |              |              |
259*  |              |              |              | "00110"      |
263*  |              |              | "1111"       |              |
300   | "00000"      | "01010"      |              |              |
309*  |              |              |              | "00000"      |
313*  |              |              | "0000"       |              |
350   | "01010"      |              |              |              |
359*  |              |              |              | "00100"      |
363*  |              |              | "0110"       |              |
400   | "10110"      |              |              |              |
409*  |              |              |              | "11100"      |
413*  |              |              | "1010"       |              |
450   | "00000"      | "10110"      |              |              |
459*  |              |              |              | "00000"      |
463*  |              |              | "0000"       |              |
500   | "01010"      |              |              |              |
509*  |              |              |              | "11100"      |
513*  |              |              | "1010"       |              |
550   | "10110"      |              |              |              |
559*  |              |              |              | "00100"      |
563*  |              |              | "0110"       |              |


use work.SD_DEFINITIONS.all;
entity MULT_BLOCK is
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( DIGIT_C : in SD_DIGIT;
           SD_NUMB : in SD_NUMBER;
           RESULT  : out PARTIAL_P ( 0 to 16 ));
end MULT_BLOCK;


use work.SD_DEFINITIONS.all;
architecture Structural of MULT_BLOCK is

    component M0_MULT
        generic ( TECHNOLOGY_SCALE : real := 1.0 );
        port ( A_DIGIT : in SD_DIGIT;
               B_DIGIT : in SD_DIGIT;
               W_OUT   : out W_TYPE;
               U_OUT   : out U_TYPE);
    end component;

    component S1_ADDER
        generic ( TECHNOLOGY_SCALE : real := 1.0 );
        port ( SD1_in  : in SD_DIGIT;
               SD2_in  : in SD_DIGIT;
               ADD_SUB : in bit;
               X_out   : out X_TYPE;
               T_out   : out T_TYPE);
    end component;

    component S2_ADDER
        generic ( TECHNOLOGY_SCALE : real := 1.0 );
        port ( X_in   : in X_TYPE;
               T_in   : in T_TYPE;
               SD_out : out SD_DIGIT);
    end component;


    for all : M0_MULT use entity work.M0_MULT(Behavioral);
    for all : S1_ADDER use entity work.S1_ADDER(Behavioral);
    for all : S2_ADDER use entity work.S2_ADDER(Behavioral);

    signal W_ARR : W_ARRAY ( 0 to 15 );
    signal U_ARR : U_ARRAY ( 0 to 15 );
    signal UDIG : PARTIAL_P( 0 to 15 );
    signal X_ARR : X_ARRAY ( 0 to 14 );
    signal T_ARR : T_ARRAY ( 0 to 14 );
    signal ADD_CNTL : bit;

begin

    -- One digit multiplier per digit of SD_NUMB.
    M00 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(0),
                   W_OUT => W_ARR(0), U_OUT => U_ARR(0));

    M01 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(1),
                   W_OUT => W_ARR(1), U_OUT => U_ARR(1));

    M02 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(2),
                   W_OUT => W_ARR(2), U_OUT => U_ARR(2));

    M03 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(3),
                   W_OUT => W_ARR(3), U_OUT => U_ARR(3));

    M04 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(4),
                   W_OUT => W_ARR(4), U_OUT => U_ARR(4));

    M05 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(5),
                   W_OUT => W_ARR(5), U_OUT => U_ARR(5));

    M06 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(6),
                   W_OUT => W_ARR(6), U_OUT => U_ARR(6));

    M07 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(7),
                   W_OUT => W_ARR(7), U_OUT => U_ARR(7));

    M08 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(8),
                   W_OUT => W_ARR(8), U_OUT => U_ARR(8));

    M09 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(9),
                   W_OUT => W_ARR(9), U_OUT => U_ARR(9));

    M10 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(10),
                   W_OUT => W_ARR(10), U_OUT => U_ARR(10));

    M11 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(11),
                   W_OUT => W_ARR(11), U_OUT => U_ARR(11));

    M12 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(12),
                   W_OUT => W_ARR(12), U_OUT => U_ARR(12));

    M13 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(13),
                   W_OUT => W_ARR(13), U_OUT => U_ARR(13));

    M14 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(14),
                   W_OUT => W_ARR(14), U_OUT => U_ARR(14));

    M15 : M0_MULT
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( A_DIGIT => DIGIT_C, B_DIGIT => SD_NUMB(15),
                   W_OUT => W_ARR(15), U_OUT => U_ARR(15));

    -- Sign-extend each 4-bit upper digit to 5 bits.
    UDIG(0) <= U_TO_T( U_ARR(0));
    UDIG(1) <= U_TO_T( U_ARR(1));
    UDIG(2) <= U_TO_T( U_ARR(2));
    UDIG(3) <= U_TO_T( U_ARR(3));
    UDIG(4) <= U_TO_T( U_ARR(4));
    UDIG(5) <= U_TO_T( U_ARR(5));
    UDIG(6) <= U_TO_T( U_ARR(6));
    UDIG(7) <= U_TO_T( U_ARR(7));
    UDIG(8) <= U_TO_T( U_ARR(8));
    UDIG(9) <= U_TO_T( U_ARR(9));
    UDIG(10) <= U_TO_T( U_ARR(10));
    UDIG(11) <= U_TO_T( U_ARR(11));
    UDIG(12) <= U_TO_T( U_ARR(12));
    UDIG(13) <= U_TO_T( U_ARR(13));
    UDIG(14) <= U_TO_T( U_ARR(14));
    UDIG(15) <= U_TO_T( U_ARR(15));
    -- First adder stage: upper digit of position i+1 plus lower digit of position i.
    S10 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(1), SD2_in => W_ARR(0),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(0), T_out => T_ARR(0));

    S11 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(2), SD2_in => W_ARR(1),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(1), T_out => T_ARR(1));

    S12 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(3), SD2_in => W_ARR(2),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(2), T_out => T_ARR(2));

    S13 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(4), SD2_in => W_ARR(3),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(3), T_out => T_ARR(3));

    S14 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(5), SD2_in => W_ARR(4),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(4), T_out => T_ARR(4));

    S15 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(6), SD2_in => W_ARR(5),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(5), T_out => T_ARR(5));

    S16 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(7), SD2_in => W_ARR(6),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(6), T_out => T_ARR(6));

    S17 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(8), SD2_in => W_ARR(7),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(7), T_out => T_ARR(7));

    S18 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(9), SD2_in => W_ARR(8),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(8), T_out => T_ARR(8));

    S19 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(10), SD2_in => W_ARR(9),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(9), T_out => T_ARR(9));

    S1A : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(11), SD2_in => W_ARR(10),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(10), T_out => T_ARR(10));

    S1B : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(12), SD2_in => W_ARR(11),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(11), T_out => T_ARR(11));

    S1C : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(13), SD2_in => W_ARR(12),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(12), T_out => T_ARR(12));

    S1D : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(14), SD2_in => W_ARR(13),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(13), T_out => T_ARR(13));

    S1E : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in => UDIG(15), SD2_in => W_ARR(14),
                   ADD_SUB => ADD_CNTL,
                   X_out => X_ARR(14), T_out => T_ARR(14));

    -- Second adder stage: interim sum of position i-1 plus transfer digit of position i.
    S20 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(0), T_in => UDIG(0), SD_out => RESULT(0));

    S21 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(1), T_in => T_ARR(0), SD_out => RESULT(1));

    S22 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(2), T_in => T_ARR(1), SD_out => RESULT(2));

    S23 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(3), T_in => T_ARR(2), SD_out => RESULT(3));

    S24 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(4), T_in => T_ARR(3), SD_out => RESULT(4));

    S25 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(5), T_in => T_ARR(4), SD_out => RESULT(5));

    S26 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(6), T_in => T_ARR(5), SD_out => RESULT(6));

    S27 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(7), T_in => T_ARR(6), SD_out => RESULT(7));

    S28 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(8), T_in => T_ARR(7), SD_out => RESULT(8));

    S29 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(9), T_in => T_ARR(8), SD_out => RESULT(9));

    S2A : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(10), T_in => T_ARR(9), SD_out => RESULT(10));

    S2B : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(11), T_in => T_ARR(10), SD_out => RESULT(11));

    S2C : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(12), T_in => T_ARR(11), SD_out => RESULT(12));

    S2D : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(13), T_in => T_ARR(12), SD_out => RESULT(13));

    S2E : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in => X_ARR(14), T_in => T_ARR(13), SD_out => RESULT(14));

    RESULT(15) <= T_ARR(14);
    RESULT(16) <= W_ARR(15);

end Structural;
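The wiring of the multiplier block (16 digit multipliers, 15 first-stage adders, 15 second-stage adders) can be checked numerically with an integer model. The Python sketch below is an illustrative assumption rather than thesis code: digits are plain integers, index 0 is taken as the most significant radix-16 digit, and the names `mult_block` and `sd_value` are hypothetical.

```python
def m0_mult(a, b):
    """Return (u, w) with 16*u + w == a*b, as in the M0_MULT process."""
    prod, u = a * b, 0
    if prod >= 0:
        for _ in range(6):
            if prod >= 8:
                prod, u = prod - 16, u + 1
    else:
        for _ in range(6):
            if prod <= -8:
                prod, u = prod + 16, u - 1
    return u, prod


def s1_adder(sd1, sd2):
    """Return (transfer digit, interim sum) for one digit position."""
    total = sd1 + sd2
    if total >= 8:
        return 1, total - 16
    if total <= -8:
        return -1, total + 16
    return 0, total


def mult_block(digit_c, sd_numb):
    """Multiply a signed-digit number (MSD first) by one digit; result has one more digit."""
    n = len(sd_numb)
    pairs = [m0_mult(digit_c, d) for d in sd_numb]
    u = [p[0] for p in pairs]
    w = [p[1] for p in pairs]
    stage1 = [s1_adder(u[i + 1], w[i]) for i in range(n - 1)]
    x = [p[0] for p in stage1]
    t = [p[1] for p in stage1]
    result = [u[0] + x[0]]                                   # RESULT(0)
    result += [t[i - 1] + x[i] for i in range(1, n - 1)]     # RESULT(1 .. n-2)
    result.append(t[n - 2])                                  # RESULT(n-1)
    result.append(w[n - 1])                                  # RESULT(n)
    return result


def sd_value(digits):
    """Integer value of a radix-16 signed-digit list, most significant digit first."""
    value = 0
    for d in digits:
        value = value * 16 + d
    return value


if __name__ == "__main__":
    number = [10, -10, 3, 0, 7, -7, 1, -1, 9, -9, 5, -5, 2, -2, 8, -8]
    product = mult_block(-7, number)
    print(product, sd_value(product) == -7 * sd_value(number))
```

The model omits ADD_CNTL, which the structural description leaves at its default value, and all timing. It only confirms that the index offsets between the U, W, X and T arrays preserve the value of the product.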
use work.SD_DEFINITIONS.all;
entity ADDER_1 is
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( SD1   : in SD_DIGIT;
           SD2   : in SD_DIGIT;
           T_in  : in T_TYPE;
           T_out : out T_TYPE;
           SUMr  : out SD_DIGIT);
end ADDER_1;

use work.SD_DEFINITIONS.all;
architecture Structural of ADDER_1 is

    component S1_ADDER
        generic ( TECHNOLOGY_SCALE : real := 1.0 );
        port ( SD1_in  : in SD_DIGIT;
               SD2_in  : in SD_DIGIT;
               ADD_SUB : in bit;
               X_out   : out X_TYPE;
               T_out   : out T_TYPE);
    end component;

    component S2_ADDER
        generic ( TECHNOLOGY_SCALE : real := 1.0 );
        port ( X_in   : in X_TYPE;
               T_in   : in T_TYPE;
               SD_out : out SD_DIGIT );
    end component;

    for all : S1_ADDER use entity work.S1_ADDER(Behavioral);
    for all : S2_ADDER use entity work.S2_ADDER(Behavioral);

    signal XDIG : X_TYPE;
    signal ADD_SIG : bit;

begin

    S1 : S1_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( SD1_in  => SD1,
                   SD2_in  => SD2,
                   ADD_SUB => ADD_SIG,
                   X_out   => XDIG,
                   T_out   => T_out );

    S2 : S2_ADDER
        generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
        port map ( X_in   => XDIG,
                   T_in   => T_in,
                   SD_out => SUMr );

end Structural;
use work.SD_DEFINITIONS.all;
entity SL2_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( PARTIAL_H : in  PARTIAL_P ( 0 to 16 );
         PARTIAL_L : in  PARTIAL_P ( 0 to 16 );
         P_out     : out PARTIAL_P ( 0 to 17 ));
end SL2_ADDER;

use work.SD_DEFINITIONS.all;
architecture Structural of SL2_ADDER is

  component ADDER_1
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( SD1   : in  SD_DIGIT;
           SD2   : in  SD_DIGIT;
           T_in  : in  T_TYPE;
           T_out : out T_TYPE;
           SUMr  : out SD_DIGIT );
  end component;

  for all : ADDER_1 use entity work.ADDER_1(Structural);

  signal T_ARR : T_ARRAY ( 0 to 16 );

begin

  T_ARR(0) <= PARTIAL_H(0);

  ADD0 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 1 ),
               SD2   => PARTIAL_L( 0 ),
               T_in  => T_ARR( 0 ),
               T_out => T_ARR( 1 ),
               SUMr  => P_out( 0 ) );

  ADD1 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 2 ),
               SD2   => PARTIAL_L( 1 ),
               T_in  => T_ARR( 1 ),
               T_out => T_ARR( 2 ),
               SUMr  => P_out( 1 ) );

  ADD2 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 3 ),
               SD2   => PARTIAL_L( 2 ),
               T_in  => T_ARR( 2 ),
               T_out => T_ARR( 3 ),
               SUMr  => P_out( 2 ) );

  ADD3 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 4 ),
               SD2   => PARTIAL_L( 3 ),
               T_in  => T_ARR( 3 ),
               T_out => T_ARR( 4 ),
               SUMr  => P_out( 3 ) );

  ADD4 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 5 ),
               SD2   => PARTIAL_L( 4 ),
               T_in  => T_ARR( 4 ),
               T_out => T_ARR( 5 ),
               SUMr  => P_out( 4 ) );

  ADD5 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 6 ),
               SD2   => PARTIAL_L( 5 ),
               T_in  => T_ARR( 5 ),
               T_out => T_ARR( 6 ),
               SUMr  => P_out( 5 ) );

  ADD6 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 7 ),
               SD2   => PARTIAL_L( 6 ),
               T_in  => T_ARR( 6 ),
               T_out => T_ARR( 7 ),
               SUMr  => P_out( 6 ) );

  ADD7 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 8 ),
               SD2   => PARTIAL_L( 7 ),
               T_in  => T_ARR( 7 ),
               T_out => T_ARR( 8 ),
               SUMr  => P_out( 7 ) );

  ADD8 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 9 ),
               SD2   => PARTIAL_L( 8 ),
               T_in  => T_ARR( 8 ),
               T_out => T_ARR( 9 ),
               SUMr  => P_out( 8 ) );

  ADD9 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 10 ),
               SD2   => PARTIAL_L( 9 ),
               T_in  => T_ARR( 9 ),
               T_out => T_ARR( 10 ),
               SUMr  => P_out( 9 ) );

  ADDA : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 11 ),
               SD2   => PARTIAL_L( 10 ),
               T_in  => T_ARR( 10 ),
               T_out => T_ARR( 11 ),
               SUMr  => P_out( 10 ) );

  ADDB : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 12 ),
               SD2   => PARTIAL_L( 11 ),
               T_in  => T_ARR( 11 ),
               T_out => T_ARR( 12 ),
               SUMr  => P_out( 11 ) );

  ADDC : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 13 ),
               SD2   => PARTIAL_L( 12 ),
               T_in  => T_ARR( 12 ),
               T_out => T_ARR( 13 ),
               SUMr  => P_out( 12 ) );

  ADDD : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 14 ),
               SD2   => PARTIAL_L( 13 ),
               T_in  => T_ARR( 13 ),
               T_out => T_ARR( 14 ),
               SUMr  => P_out( 13 ) );

  ADDE : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 15 ),
               SD2   => PARTIAL_L( 14 ),
               T_in  => T_ARR( 14 ),
               T_out => T_ARR( 15 ),
               SUMr  => P_out( 14 ) );

  ADDF : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 16 ),
               SD2   => PARTIAL_L( 15 ),
               T_in  => T_ARR( 15 ),
               T_out => T_ARR( 16 ),
               SUMr  => P_out( 15 ) );

  P_out( 16 ) <= T_ARR( 16 );
  P_out( 17 ) <= PARTIAL_L( 16 );

end Structural;




use work.SD_DEFINITIONS.all;
entity SL3_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( PARTIAL_H : in  PARTIAL_P ( 0 to 17 );
         PARTIAL_L : in  PARTIAL_P ( 0 to 17 );
         P_out     : out PARTIAL_P ( 0 to 19 ));
end SL3_ADDER;

use work.SD_DEFINITIONS.all;
architecture Structural of SL3_ADDER is

  component ADDER_1
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( SD1   : in  SD_DIGIT;
           SD2   : in  SD_DIGIT;
           T_in  : in  T_TYPE;
           T_out : out T_TYPE;
           SUMr  : out SD_DIGIT );
  end component;

  for all : ADDER_1 use entity work.ADDER_1(Structural);

  signal T_ARR : T_ARRAY ( 0 to 16 );

begin

  P_out(0) <= PARTIAL_H(0);
  T_ARR(0) <= PARTIAL_H(1);

  ADD0 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 2 ),
               SD2   => PARTIAL_L( 0 ),
               T_in  => T_ARR( 0 ),
               T_out => T_ARR( 1 ),
               SUMr  => P_out( 1 ) );

  ADD1 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 3 ),
               SD2   => PARTIAL_L( 1 ),
               T_in  => T_ARR( 1 ),
               T_out => T_ARR( 2 ),
               SUMr  => P_out( 2 ) );

  ADD2 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 4 ),
               SD2   => PARTIAL_L( 2 ),
               T_in  => T_ARR( 2 ),
               T_out => T_ARR( 3 ),
               SUMr  => P_out( 3 ) );

  ADD3 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 5 ),
               SD2   => PARTIAL_L( 3 ),
               T_in  => T_ARR( 3 ),
               T_out => T_ARR( 4 ),
               SUMr  => P_out( 4 ) );

  ADD4 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 6 ),
               SD2   => PARTIAL_L( 4 ),
               T_in  => T_ARR( 4 ),
               T_out => T_ARR( 5 ),
               SUMr  => P_out( 5 ) );

  ADD5 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 7 ),
               SD2   => PARTIAL_L( 5 ),
               T_in  => T_ARR( 5 ),
               T_out => T_ARR( 6 ),
               SUMr  => P_out( 6 ) );

  ADD6 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 8 ),
               SD2   => PARTIAL_L( 6 ),
               T_in  => T_ARR( 6 ),
               T_out => T_ARR( 7 ),
               SUMr  => P_out( 7 ) );

  ADD7 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 9 ),
               SD2   => PARTIAL_L( 7 ),
               T_in  => T_ARR( 7 ),
               T_out => T_ARR( 8 ),
               SUMr  => P_out( 8 ) );

  ADD8 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 10 ),
               SD2   => PARTIAL_L( 8 ),
               T_in  => T_ARR( 8 ),
               T_out => T_ARR( 9 ),
               SUMr  => P_out( 9 ) );

  ADD9 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 11 ),
               SD2   => PARTIAL_L( 9 ),
               T_in  => T_ARR( 9 ),
               T_out => T_ARR( 10 ),
               SUMr  => P_out( 10 ) );

  ADDA : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 12 ),
               SD2   => PARTIAL_L( 10 ),
               T_in  => T_ARR( 10 ),
               T_out => T_ARR( 11 ),
               SUMr  => P_out( 11 ) );

  ADDB : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 13 ),
               SD2   => PARTIAL_L( 11 ),
               T_in  => T_ARR( 11 ),
               T_out => T_ARR( 12 ),
               SUMr  => P_out( 12 ) );

  ADDC : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 14 ),
               SD2   => PARTIAL_L( 12 ),
               T_in  => T_ARR( 12 ),
               T_out => T_ARR( 13 ),
               SUMr  => P_out( 13 ) );

  ADDD : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 15 ),
               SD2   => PARTIAL_L( 13 ),
               T_in  => T_ARR( 13 ),
               T_out => T_ARR( 14 ),
               SUMr  => P_out( 14 ) );

  ADDE : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 16 ),
               SD2   => PARTIAL_L( 14 ),
               T_in  => T_ARR( 14 ),
               T_out => T_ARR( 15 ),
               SUMr  => P_out( 15 ) );

  ADDF : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 17 ),
               SD2   => PARTIAL_L( 15 ),
               T_in  => T_ARR( 15 ),
               T_out => T_ARR( 16 ),
               SUMr  => P_out( 16 ) );

  P_out( 17 ) <= T_ARR( 16 );
  P_out( 18 ) <= PARTIAL_L( 16 );
  P_out( 19 ) <= PARTIAL_L( 17 );

end Structural;
use work.SD_DEFINITIONS.all;
entity SL4_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( PARTIAL_H : in  PARTIAL_P ( 0 to 19 );
         PARTIAL_L : in  PARTIAL_P ( 0 to 19 );
         P_out     : out PARTIAL_P ( 0 to 23 ));
end SL4_ADDER;

use work.SD_DEFINITIONS.all;
architecture Structural of SL4_ADDER is

  component ADDER_1
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( SD1   : in  SD_DIGIT;
           SD2   : in  SD_DIGIT;
           T_in  : in  T_TYPE;
           T_out : out T_TYPE;
           SUMr  : out SD_DIGIT );
  end component;

  for all : ADDER_1 use entity work.ADDER_1(Structural);

  signal T_ARR : T_ARRAY ( 0 to 16 );

begin

  P_out(0) <= PARTIAL_H(0);
  P_out(1) <= PARTIAL_H(1);
  P_out(2) <= PARTIAL_H(2);
  T_ARR(0) <= PARTIAL_H(3);

  ADD0 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 4 ),
               SD2   => PARTIAL_L( 0 ),
               T_in  => T_ARR( 0 ),
               T_out => T_ARR( 1 ),
               SUMr  => P_out( 3 ) );

  ADD1 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 5 ),
               SD2   => PARTIAL_L( 1 ),
               T_in  => T_ARR( 1 ),
               T_out => T_ARR( 2 ),
               SUMr  => P_out( 4 ) );

  ADD2 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 6 ),
               SD2   => PARTIAL_L( 2 ),
               T_in  => T_ARR( 2 ),
               T_out => T_ARR( 3 ),
               SUMr  => P_out( 5 ) );

  ADD3 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 7 ),
               SD2   => PARTIAL_L( 3 ),
               T_in  => T_ARR( 3 ),
               T_out => T_ARR( 4 ),
               SUMr  => P_out( 6 ) );

  ADD4 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 8 ),
               SD2   => PARTIAL_L( 4 ),
               T_in  => T_ARR( 4 ),
               T_out => T_ARR( 5 ),
               SUMr  => P_out( 7 ) );

  ADD5 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 9 ),
               SD2   => PARTIAL_L( 5 ),
               T_in  => T_ARR( 5 ),
               T_out => T_ARR( 6 ),
               SUMr  => P_out( 8 ) );

  ADD6 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 10 ),
               SD2   => PARTIAL_L( 6 ),
               T_in  => T_ARR( 6 ),
               T_out => T_ARR( 7 ),
               SUMr  => P_out( 9 ) );

  ADD7 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 11 ),
               SD2   => PARTIAL_L( 7 ),
               T_in  => T_ARR( 7 ),
               T_out => T_ARR( 8 ),
               SUMr  => P_out( 10 ) );

  ADD8 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 12 ),
               SD2   => PARTIAL_L( 8 ),
               T_in  => T_ARR( 8 ),
               T_out => T_ARR( 9 ),
               SUMr  => P_out( 11 ) );

  ADD9 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 13 ),
               SD2   => PARTIAL_L( 9 ),
               T_in  => T_ARR( 9 ),
               T_out => T_ARR( 10 ),
               SUMr  => P_out( 12 ) );

  ADDA : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 14 ),
               SD2   => PARTIAL_L( 10 ),
               T_in  => T_ARR( 10 ),
               T_out => T_ARR( 11 ),
               SUMr  => P_out( 13 ) );

  ADDB : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 15 ),
               SD2   => PARTIAL_L( 11 ),
               T_in  => T_ARR( 11 ),
               T_out => T_ARR( 12 ),
               SUMr  => P_out( 14 ) );

  ADDC : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 16 ),
               SD2   => PARTIAL_L( 12 ),
               T_in  => T_ARR( 12 ),
               T_out => T_ARR( 13 ),
               SUMr  => P_out( 15 ) );

  ADDD : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 17 ),
               SD2   => PARTIAL_L( 13 ),
               T_in  => T_ARR( 13 ),
               T_out => T_ARR( 14 ),
               SUMr  => P_out( 16 ) );

  ADDE : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 18 ),
               SD2   => PARTIAL_L( 14 ),
               T_in  => T_ARR( 14 ),
               T_out => T_ARR( 15 ),
               SUMr  => P_out( 17 ) );

  ADDF : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 19 ),
               SD2   => PARTIAL_L( 15 ),
               T_in  => T_ARR( 15 ),
               T_out => T_ARR( 16 ),
               SUMr  => P_out( 18 ) );

  P_out( 19 ) <= T_ARR( 16 );
  P_out( 20 ) <= PARTIAL_L( 16 );
  P_out( 21 ) <= PARTIAL_L( 17 );
  P_out( 22 ) <= PARTIAL_L( 18 );
  P_out( 23 ) <= PARTIAL_L( 19 );

end Structural;
use work.SD_DEFINITIONS.all;
entity SL5_ADDER is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( PARTIAL_H : in  PARTIAL_P ( 0 to 23 );
         PARTIAL_L : in  PARTIAL_P ( 0 to 23 );
         P_out     : out PARTIAL_P ( 0 to 31 ));
end SL5_ADDER;

use work.SD_DEFINITIONS.all;
architecture Structural of SL5_ADDER is

  component ADDER_1
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( SD1   : in  SD_DIGIT;
           SD2   : in  SD_DIGIT;
           T_in  : in  T_TYPE;
           T_out : out T_TYPE;
           SUMr  : out SD_DIGIT );
  end component;

  for all : ADDER_1 use entity work.ADDER_1(Structural);

  signal T_ARR : T_ARRAY ( 0 to 16 );

begin

  P_out(0) <= PARTIAL_H(0);
  P_out(1) <= PARTIAL_H(1);
  P_out(2) <= PARTIAL_H(2);
  P_out(3) <= PARTIAL_H(3);
  P_out(4) <= PARTIAL_H(4);
  P_out(5) <= PARTIAL_H(5);
  P_out(6) <= PARTIAL_H(6);
  T_ARR(0) <= PARTIAL_H(7);

  ADD0 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 8 ),
               SD2   => PARTIAL_L( 0 ),
               T_in  => T_ARR( 0 ),
               T_out => T_ARR( 1 ),
               SUMr  => P_out( 7 ) );

  ADD1 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 9 ),
               SD2   => PARTIAL_L( 1 ),
               T_in  => T_ARR( 1 ),
               T_out => T_ARR( 2 ),
               SUMr  => P_out( 8 ) );

  ADD2 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 10 ),
               SD2   => PARTIAL_L( 2 ),
               T_in  => T_ARR( 2 ),
               T_out => T_ARR( 3 ),
               SUMr  => P_out( 9 ) );

  ADD3 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 11 ),
               SD2   => PARTIAL_L( 3 ),
               T_in  => T_ARR( 3 ),
               T_out => T_ARR( 4 ),
               SUMr  => P_out( 10 ) );

  ADD4 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 12 ),
               SD2   => PARTIAL_L( 4 ),
               T_in  => T_ARR( 4 ),
               T_out => T_ARR( 5 ),
               SUMr  => P_out( 11 ) );

  ADD5 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 13 ),
               SD2   => PARTIAL_L( 5 ),
               T_in  => T_ARR( 5 ),
               T_out => T_ARR( 6 ),
               SUMr  => P_out( 12 ) );

  ADD6 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 14 ),
               SD2   => PARTIAL_L( 6 ),
               T_in  => T_ARR( 6 ),
               T_out => T_ARR( 7 ),
               SUMr  => P_out( 13 ) );

  ADD7 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 15 ),
               SD2   => PARTIAL_L( 7 ),
               T_in  => T_ARR( 7 ),
               T_out => T_ARR( 8 ),
               SUMr  => P_out( 14 ) );

  ADD8 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 16 ),
               SD2   => PARTIAL_L( 8 ),
               T_in  => T_ARR( 8 ),
               T_out => T_ARR( 9 ),
               SUMr  => P_out( 15 ) );

  ADD9 : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 17 ),
               SD2   => PARTIAL_L( 9 ),
               T_in  => T_ARR( 9 ),
               T_out => T_ARR( 10 ),
               SUMr  => P_out( 16 ) );

  ADDA : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 18 ),
               SD2   => PARTIAL_L( 10 ),
               T_in  => T_ARR( 10 ),
               T_out => T_ARR( 11 ),
               SUMr  => P_out( 17 ) );

  ADDB : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 19 ),
               SD2   => PARTIAL_L( 11 ),
               T_in  => T_ARR( 11 ),
               T_out => T_ARR( 12 ),
               SUMr  => P_out( 18 ) );

  ADDC : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 20 ),
               SD2   => PARTIAL_L( 12 ),
               T_in  => T_ARR( 12 ),
               T_out => T_ARR( 13 ),
               SUMr  => P_out( 19 ) );

  ADDD : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 21 ),
               SD2   => PARTIAL_L( 13 ),
               T_in  => T_ARR( 13 ),
               T_out => T_ARR( 14 ),
               SUMr  => P_out( 20 ) );

  ADDE : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 22 ),
               SD2   => PARTIAL_L( 14 ),
               T_in  => T_ARR( 14 ),
               T_out => T_ARR( 15 ),
               SUMr  => P_out( 21 ) );

  ADDF : ADDER_1
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( SD1   => PARTIAL_H( 23 ),
               SD2   => PARTIAL_L( 15 ),
               T_in  => T_ARR( 15 ),
               T_out => T_ARR( 16 ),
               SUMr  => P_out( 22 ) );

  P_out( 23 ) <= T_ARR( 16 );
  P_out( 24 ) <= PARTIAL_L( 16 );
  P_out( 25 ) <= PARTIAL_L( 17 );
  P_out( 26 ) <= PARTIAL_L( 18 );
  P_out( 27 ) <= PARTIAL_L( 19 );
  P_out( 28 ) <= PARTIAL_L( 20 );
  P_out( 29 ) <= PARTIAL_L( 21 );
  P_out( 30 ) <= PARTIAL_L( 22 );
  P_out( 31 ) <= PARTIAL_L( 23 );

end Structural;
use work.SD_DEFINITIONS.all;
entity SD_MULT is
  generic ( TECHNOLOGY_SCALE : real := 1.0 );
  port ( SD_A   : in  SD_NUMBER;
         SD_B   : in  SD_NUMBER;
         SD_out : out PARTIAL_P ( 0 to 31 ) );
end SD_MULT;

use work.SD_DEFINITIONS.all;
architecture Structural of SD_MULT is

  component MULT_BLOCK
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( DIGIT_C : in  SD_DIGIT;
           SD_NUMB : in  SD_NUMBER;
           RESULT  : out PARTIAL_P ( 0 to 16 ));
  end component;

  component SL2_ADDER
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( PARTIAL_H : in  PARTIAL_P ( 0 to 16 );
           PARTIAL_L : in  PARTIAL_P ( 0 to 16 );
           P_out     : out PARTIAL_P ( 0 to 17 ));
  end component;

  component SL3_ADDER
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( PARTIAL_H : in  PARTIAL_P ( 0 to 17 );
           PARTIAL_L : in  PARTIAL_P ( 0 to 17 );
           P_out     : out PARTIAL_P ( 0 to 19 ));
  end component;

  component SL4_ADDER
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( PARTIAL_H : in  PARTIAL_P ( 0 to 19 );
           PARTIAL_L : in  PARTIAL_P ( 0 to 19 );
           P_out     : out PARTIAL_P ( 0 to 23 ));
  end component;

  component SL5_ADDER
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( PARTIAL_H : in  PARTIAL_P ( 0 to 23 );
           PARTIAL_L : in  PARTIAL_P ( 0 to 23 );
           P_out     : out PARTIAL_P ( 0 to 31 ));
  end component;

  for all : MULT_BLOCK use entity work.MULT_BLOCK(Structural);
  for all : SL2_ADDER  use entity work.SL2_ADDER(Structural);
  for all : SL3_ADDER  use entity work.SL3_ADDER(Structural);
  for all : SL4_ADDER  use entity work.SL4_ADDER(Structural);
  for all : SL5_ADDER  use entity work.SL5_ADDER(Structural);

  type PL12 is array ( 0 to 15 ) of PARTIAL_P( 0 to 16 );
  type PL23 is array ( 0 to 7 )  of PARTIAL_P( 0 to 17 );
  type PL34 is array ( 0 to 3 )  of PARTIAL_P( 0 to 19 );
  type PL45 is array ( 0 to 1 )  of PARTIAL_P( 0 to 23 );

  signal PARTIAL_1 : PL12;
  signal PARTIAL_2 : PL23;
  signal PARTIAL_3 : PL34;
  signal PARTIAL_4 : PL45;

begin

  MU00 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 0 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 0 ) );

  MU01 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 1 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 1 ) );

  MU02 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 2 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 2 ) );

  MU03 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 3 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 3 ) );

  MU04 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 4 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 4 ) );

  MU05 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 5 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 5 ) );

  MU06 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 6 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 6 ) );

  MU07 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 7 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 7 ) );

  MU08 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 8 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 8 ) );

  MU09 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 9 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 9 ) );

  MU10 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 10 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 10 ) );

  MU11 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 11 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 11 ) );

  MU12 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 12 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 12 ) );

  MU13 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 13 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 13 ) );

  MU14 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 14 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 14 ) );

  MU15 : MULT_BLOCK
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( DIGIT_C => SD_A( 15 ),
               SD_NUMB => SD_B,
               RESULT  => PARTIAL_1( 15 ) );

  A10 : SL2_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_1( 0 ),
               PARTIAL_L => PARTIAL_1( 1 ),
               P_out     => PARTIAL_2( 0 ) );

  A11 : SL2_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_1( 2 ),
               PARTIAL_L => PARTIAL_1( 3 ),
               P_out     => PARTIAL_2( 1 ) );

  A12 : SL2_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_1( 4 ),
               PARTIAL_L => PARTIAL_1( 5 ),
               P_out     => PARTIAL_2( 2 ) );

  A13 : SL2_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_1( 6 ),
               PARTIAL_L => PARTIAL_1( 7 ),
               P_out     => PARTIAL_2( 3 ) );

  A14 : SL2_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_1( 8 ),
               PARTIAL_L => PARTIAL_1( 9 ),
               P_out     => PARTIAL_2( 4 ) );

  A15 : SL2_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_1( 10 ),
               PARTIAL_L => PARTIAL_1( 11 ),
               P_out     => PARTIAL_2( 5 ) );

  A16 : SL2_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_1( 12 ),
               PARTIAL_L => PARTIAL_1( 13 ),
               P_out     => PARTIAL_2( 6 ) );

  A17 : SL2_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_1( 14 ),
               PARTIAL_L => PARTIAL_1( 15 ),
               P_out     => PARTIAL_2( 7 ) );

  A20 : SL3_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_2( 0 ),
               PARTIAL_L => PARTIAL_2( 1 ),
               P_out     => PARTIAL_3( 0 ) );

  A21 : SL3_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_2( 2 ),
               PARTIAL_L => PARTIAL_2( 3 ),
               P_out     => PARTIAL_3( 1 ) );

  A22 : SL3_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_2( 4 ),
               PARTIAL_L => PARTIAL_2( 5 ),
               P_out     => PARTIAL_3( 2 ) );

  A23 : SL3_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_2( 6 ),
               PARTIAL_L => PARTIAL_2( 7 ),
               P_out     => PARTIAL_3( 3 ) );

  A30 : SL4_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_3( 0 ),
               PARTIAL_L => PARTIAL_3( 1 ),
               P_out     => PARTIAL_4( 0 ) );

  A31 : SL4_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_3( 2 ),
               PARTIAL_L => PARTIAL_3( 3 ),
               P_out     => PARTIAL_4( 1 ) );

  A4 : SL5_ADDER
    generic map ( TECHNOLOGY_SCALE => TECHNOLOGY_SCALE )
    port map ( PARTIAL_H => PARTIAL_4( 0 ),
               PARTIAL_L => PARTIAL_4( 1 ),
               P_out     => SD_out );

end Structural;
use work.SD_DEFINITIONS.all;
entity SD_MULT_TB is
end SD_MULT_TB;

use work.SD_DEFINITIONS.all;
use work.TB_PACKAGE.all;
architecture TEST_SD_MULT of SD_MULT_TB is

  component SD_MULT
    generic ( TECHNOLOGY_SCALE : real := 1.0 );
    port ( SD_A   : in  SD_NUMBER;
           SD_B   : in  SD_NUMBER;
           SD_out : out PARTIAL_P ( 0 to 31 ) );
  end component;

  for all : SD_MULT use entity work.SD_MULT(Structural);

  signal NUMBER_A, NUMBER_B : SD_NUMBER;
  signal RESULT  : PARTIAL_P ( 0 to 31 );
  signal A_VALUE : real := 0.0;
  signal B_VALUE : real := 0.0;
  signal R_VALUE : real := 0.0;

  alias RESULT_H  : PARTIAL_P ( 0 to 15 ) is RESULT( 0 to 15 );
  alias RESULT_L  : PARTIAL_P ( 0 to 15 ) is RESULT( 16 to 31 );
  alias SD_RESULT : PARTIAL_P ( 0 to 15 ) is RESULT( 1 to 16 );

begin

  MIO : SD_MULT
    generic map ( TECHNOLOGY_SCALE => 1.0 )
    port map ( SD_A   => NUMBER_A,
               SD_B   => NUMBER_B,
               SD_out => RESULT );

  NUMBER_A <= SD_MAKE( A_VALUE );
  NUMBER_B <= SD_MAKE( B_VALUE );

  A_VALUE <=  1.0 after 200 ns,  0.5 after 300 ns,  -0.50 after 500 ns,
             -1.0 after 700 ns,  0.9 after 900 ns,   0.99 after 1100 ns;

  B_VALUE <=  1.0 after 100 ns,  0.5 after 400 ns,  -0.50 after 600 ns,
              0.1 after 800 ns,  0.9 after 1000 ns,  0.99 after 1200 ns,
             -1.0 after 1300 ns;

  R_VALUE <= SD_TO_REAL(SD_RESULT);

end TEST_SD_MULT;
"Multiplier Unit report"

Vhdl Simulation Report

Report Name:           "Multiplier Unit report"
Kernel Library Name:   «PETERSON»TEST_SD_MULT
Kernel Creation Date:  APR-13-1989
Kernel Creation Time:  10:25:39
Run Identifer:         1
Run Date:              APR-13-1989
Run Time:              10:25:39

Report Control Language File:  mult_report.rcl
Report Output File          :  mult_report.rpt

Max Time:   9223372036864775807
Max Delta:  2147483646

Report Control Language :

Simulation_report MULT_report is
begin
  Report_name is "Multiplier Unit report";
  Page_width is 80;
  Page_length is 50;
  Signal_format is vertical;
  Sample_signals by_event in ns;
  Select_signal : A_VALUE;
  Select_signal : B_VALUE;
  Select_signal : R_VALUE;
end MULT_report;

Report Format Information :

Time is in NS relative to the start of simulation
Time period for report is from 0 NS to End of Simulation
Signal values are reported by event ( ' ' indicates no event )

PAGE 1


APR-13-1989 10:46:57    VHDL Report Generator

"Multiplier Unit report"

TIME       |  ---------- SIGNAL NAMES ----------
(NS)       |  A_VALUE        B_VALUE
   0       |  0.000000E+00   0.000000E+00
 100       |                 1.000000E+00
 200       |  1.000000E+00
 234*  +3  |
 300       |  5.000000E-01
 335*  +3  |
 339   +3  |
 340*  +3  |
 400       |                 5.000000E-01
 439   +3  |
 440*  +3  |
 442*  +3  |
 500       |
 542*  +3  |
 600       |
 642*  +3  |
 700       |  ************
 739   +3  |
 740*  +3  |
 742*      |
R_VALUE
0.000000E+00
1.000000E+00
0.000000E+00
5.000000E-01
1.000000E+00
0.000000E+00
2.500000E-01
2.500000E-01
************
7.500000E-01
TIME       |  ---------------- SIGNAL NAMES ----------------
(NS)       |  A_VALUE        B_VALUE        R_VALUE
       +3  |                                5.000000E-01
 800       |                 1.000000E-01
 839   +3  |                                8.750000E-01
 840*  +3  |                                ************
 843*  +2  |
 848*  +2  |
 853*  +1  |                                ************
 858*  +1  |
 900       |  9.000000E-01
 939   +3  |                                1.500000E-01
 940*  +3  |                                8.750000E-02
 947*  +1  |                                8.750000E-02
       +2  |                                9.142151E-02
 950   +1  |                                9.142151E-02
       +2  |                                8.977165E-02
 951*  +2  |                                9.001579E-02
 953*  +1  |                                9.001579E-02
       +2  |                                9.000053E-02
 954*      |
TIME       |  ---------------- SIGNAL NAMES ----------------
(NS)       |  A_VALUE        B_VALUE        R_VALUE
       +1  |                                9.000006E-02
 957*  +1  |                                9.000006E-02
 958*  +1  |                                9.000000E-02
 959*  +1  |                                9.000000E-02
 961   +1  |                                9.000000E-02
 962*  +1  |                                9.000000E-02
 963*  +1  |                                9.000000E-02
1000       |                 9.000000E-01
1034*  +3  |                                1.090000E+00
1039   +3  |                                5.900000E-01
1040*  +3  |                                8.400000E-01
1043*  +1  |                                8.400000E-01
       +2  |                                8.400610E-01
1048*  +1  |                                8.400610E-01
       +2  |                                8.088110E-01
1050   +1  |                                8.088110E-01
       +2  |                                8.090275E-01
1052*  +1  |                                8.090275E-01
TIME       |  ---------------- SIGNAL NAMES ----------------
(NS)       |  A_VALUE        B_VALUE        R_VALUE
       +2  |                                8.100041E-01
1053*  +1  |                                8.100041E-01
       +2  |                                8.100003E-01
1054*  +1  |                                8.100003E-01
1058*  +1  |                                8.100000E-01
1059*  +1  |                                8.100000E-01
1061   +1  |                                8.100000E-01
1063*  +1  |                                8.100000E-01
1100       |  9.900000E-01
1139   +3  |                                9.350000E-01
1143*  +3  |                                8.725000E-01
1145*  +2  |                                8.726678E-01
1146*  +2  |                                8.724237E-01
1147*  +2  |                                9.036737E-01
1148*  +2  |                                8.919550E-01
1150   +1  |                                8.919550E-01
       +2  |                                8.919464E-01
1152*      |
TIME       |  ---------------- SIGNAL NAMES ----------------
(NS)       |  A_VALUE        B_VALUE        R_VALUE
       +2  |                                8.909698E-01
1153*  +1  |                                8.909698E-01
       +2  |                                8.910003E-01
1154*  +1  |                                8.910003E-01
       +2  |                                8.909994E-01
1156*  +1  |                                8.909994E-01
1158*  +1  |                                8.910000E-01
1159*  +1  |                                8.910000E-01
1161   +1  |                                8.910000E-01
1162*  +1  |                                8.910000E-01
1162*  +1  |                                8.910000E-01
1163*  +1  |                                8.910000E-01
1164*  +1  |                                8.910000E-01
1200       |                 9.900000E-01
1239   +3  |                                1.016000E+00
1243*  +2  |                                9.808438E-01
1248*  +1  |                                9.808438E-01
       +2  |                                9.808476E-01
TIME       |  ---------------- SIGNAL NAMES ----------------
(NS)       |  A_VALUE        B_VALUE        R_VALUE
1250   +2  |                                9.810764E-01
1252*  +2  |                                9.800999E-01
1253*  +1  |                                9.800999E-01
1258*  +1  |                                9.801000E-01
1259*  +1  |                                9.801000E-01
1261   +1  |                                9.801000E-01
1262*  +1  |                                9.801000E-01
1263*  +1  |                                9.801000E-01
1300       |
1334*  +3  |                                ************
1343*  +1  |                                ************
       +2  |                                ************
1348*  +2  |                                ************
1350   +2  |                                ************
1351*  +2  |                                ************
1353*  +1  |                                ************
1354*      |